Dynamic digital pre-distortion system

ABSTRACT

A Dynamic Digital Pre-Distortion (DDPD) system is disclosed to rapidly correct power amplifier (PA) non-linearity and memory effects. To perform pre-distortion, a DDPD engine predistorts an input signal in order to cancel PA nonlinearities as the signal is amplified by the PA. The DDPD engine is implemented as a composite of one linear filter and N−1 high order term linear filters. The bank of linear filters has programmable complex coefficients. To compute the coefficients, samples from the transmit path and a feedback path are captured, and covariance matrices A and B are computed using optimized hardware. After the covariance matrices are computed, Gaussian elimination processing may be employed to compute the coefficients. Mathematical and hardware optimizations may be employed to simplify and reduce the number of multiplication operands and other operations, which can enable the DDPD system to fit within a single chip.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a divisional of co-pending U.S. patent application Ser. No. 13/567,724, filed on Aug. 6, 2012, entitled “DYNAMIC DIGITAL PRE-DISTORTION SYSTEM,” now allowed, which is a divisional of U.S. patent application Ser. No. 13/198,891, filed on Aug. 5, 2011, entitled “DYNAMIC DIGITAL PRE-DISTORTION SYSTEM,” U.S. Pat. No. 8,259,843, which is a divisional of U.S. patent application Ser. No. 11/788,451, filed on Apr. 20, 2007, entitled “DYNAMIC DIGITAL PRE-DISTORTION SYSTEM,” U.S. Pat. No. 8,005,162, and is related to U.S. patent application Ser. No. 11/150,445, filed on Jun. 9, 2005, entitled “DIGITAL PRE-DISTORTION TECHNIQUE USING NONLINEAR FILTERS,” U.S. Pat. No. 7,606,322, the entire contents of all of which are hereby incorporated by reference herein.

FIELD OF THE INVENTION

Embodiments of the invention relate to the linearization of non-linear systems in general, and in particular embodiments, to the use of non-linear pre-distortion to linearize high power amplifiers in radio transmitters used in communication systems such as cellular mobile telephony.

BACKGROUND OF THE INVENTION

FIG. 1 a illustrates an exemplary cell site 2 for use within a communications network such as a cellular communications network. The cell site 2 includes a radio server 4 and a cell tower 12. The radio server 4 includes a modem 6, modem software 8, a network interface, an operation and maintenance processor and software, and soft handoff switch software. The cell tower 12 includes one or more antennae 14 mounted to the top of the cell tower for transmitting and receiving wireless communication signals.

The cell site 2 also includes one or more transceivers or radio heads (RHs) 16. Each RH 16 includes a power amplifier, digital-to-analog (D/A) converters, analog-to-digital (A/D) converters, radio frequency (RF) upconverter (UC), RF downconverter (DC), and digital signal processing circuitry 10 for communicating over multiple network protocols. The RHs 16 may be located within the same box as the radio server 4 (e.g. on one or more cards in one or more slots in a rack-mounted configuration). When the radio server 4 and the RHs 16 are located within the same box, they may be referred to as “Node B” or as a basestation.

Alternatively, the RHs 16 may be located in a separate housing from the radio server 4 but connected to an antenna on the top of the cell tower through a lossy cable, or mounted at the top of the cell tower 12 near the antennae 14, which reduces the connection loss between the RHs and the antenna. When the RHs 16 are not located in the same box as the radio server 4, they may be referred to as “remote” radio heads (RRHs). When the RRHs are located at the top of the cell tower 12, they may be referred to as tower-mounted RRHs.

FIG. 1 b illustrates a cluster of cell sites 2 in which there is a single radio server 4 connected by fiber optic lines 18 in a daisy chain or parallel configuration to multiple remote radio heads (RRHs) 20, each RRH located at a different cell site. The RRHs 20 may be located at the base of the cell tower 12 at each cell site 2 or alternatively at the top of the cell tower in a tower-mounted configuration.

As mentioned above, the RHs and RRHs of FIGS. 1 a and 1 b contain power amplifiers (PAs). The output power levels of the PAs may change over time as a function of the number of users. In general, as the number of users increases or the amount of traffic increases (e.g. if multiple users are downloading data), the output power levels increase. In addition, because each user is under power control, as the user gets closer to the cell site 2 or farther away from the cell site, the output power level transmitted to that user decreases or increases accordingly.

FIG. 2 illustrates an exemplary power amplifier characteristic curve of input power (x-axis) versus output power (y-axis). As FIG. 2 illustrates, at higher input power levels the curve compresses at 60 and becomes non-linear, so that the actual amount of output power is less than what is expected under ideal conditions. Besides this amplitude distortion, the power amplifier exhibits non-linear dynamics characteristics otherwise known as “memory effect distortion” and phase distortion. These four PA characteristics comprise the major PA distortion effects and collectively cause output power “signal distortion.”

In historical second generation (2G) cellular communication services, such as GSM, GPRS, or EDGE, which use GMSK or, in the case of EDGE, 3pi/8 MSK modulation, class C PAs were used to amplify a modulated carrier with a relatively high efficiency approaching 50% power added efficiency (PAE). No linearization of the output power versus input power curve was required, because the output signal was provided at a constant amplitude or, in the case of an EDGE signal, a very small peak-average-ratio (PAR). With current third generation (3G) cellular communication services, Gaussian-like signals are generated with large PAR, and class AB PAs or the more efficient but highly non-linear Doherty PAs are required.

Current-generation PAs are generally expensive and show low DC to RF conversion efficiency and therefore account for the main part of the heat generated by transmitter systems. PAs not only generate non-linear distortions but also possess memory effects that contribute to the nonlinear behavior significantly once the input excitation has wide instantaneous bandwidth.

The transmit signal is a modulated signal and thus consists of various frequency contents, expressed as follows:

$\begin{matrix}{{x(t)} = {\sum\limits_{i}{x_{i}\left( {t,f_{i}} \right)}}} & (0.1)\end{matrix}$

When this signal is passed through the transmitter chain comprised of a digital to analog converter (DAC), radio frequency (RF) electronics and the PA, the signal undergoes different distortions: (1) Static Non-Linear Distortion (due to frequency translation from IF to RF and more so in amplifier stage); (2) Non-linear Dynamic Distortion known as PA Memory Effect; (3) Amplitude distortions (due to non-ideal filtering); (4) Phase distortions (due to non-ideal filtering); and (5) Time Delay distortions (due to group delay variations in filtering).

In addition, the PA characteristics change with temperature. As the transmit signal rapidly changes levels, the thermal effects of the PA change, which causes the PA characteristics to change. Since the signal source is typically dynamic and the amplitude can vary 5-10 dB within a very short period (e.g. for HSDPA, High Speed Downlink Packet Access), the PA gain and phase characteristics can change fairly rapidly.

Without linearization, the efficiency of the class AB PAs in 3G cellular communication services drops significantly and would be estimated to be around 4%. Thus, there is a need to improve the efficiency of the PAs in 3G cellular communication services.

Using analog techniques, efficiency can be improved to about 8%. Conventional digital techniques can raise this efficiency to about 20-25% using Class AB power amplifiers. However, there is still a need to improve PA efficiency even more while maintaining good Channel Power Leakage (CPL). When applying conventional DPD techniques with a high efficiency PA (such as Doherty PA), the CPL of the PA signal output is degraded (and could fail the Spectral Emission Mask (SEM) requirement) especially for transitioning signals, where the signal can be transitioned from low power to high power in a rapid fashion. Therefore, the conventional approach is not practical for high efficiency PAs.

“Pre-distortion” is a known technique for applying a pre-distorted PA input signal to a PA to cancel out or compensate for the inherent distortion of the PA and improve the linearization and therefore the efficiency of the PA. However, previous digital implementations utilized digital signal processing (DSP) and software, which can be too slow for current PAs that can experience rapid changes to power levels. In addition, previous digital implementations were not optimized to work with highly non-linear PAs such as a Doherty pair, nor would they fit on a single chip.

SUMMARY OF THE INVENTION

Embodiments of the invention are directed to providing Dynamic Digital Pre-Distortion (DDPD) to rapidly correct PA non-linearity and memory effects. Objectives of this technique may include, but are not limited to: (1) correcting the nonlinearity and memory effects of the PA; (2) handling the dynamic signal and adapting to changing PA characteristics; and (3) performing high speed updates to handle fast changing data and PA characteristics.

To perform pre-distortion according to embodiments of the invention, a DDPD engine predistorts an input signal in order to cancel PA nonlinearities as the signal is amplified by the PA. In effect, the DDPD engine produces the inverse of the PA characteristics dynamically. The DDPD engine is implemented as a composite of one linear filter and N−1 high order term linear filters. The input to the linear filter is the input signal. Each high order term filter has as an input some power of the amplitude of the input and each can have a different number of taps as well. The bank of linear filters has programmable complex coefficients provided by a DDPD Coefficient Estimator. The effect of the filtering is to predistort the input signal and cancel the PA distortion so that the baseband equivalent of the output of the PA is very close to being the same as the input signal.

The objective of the DDPD Coefficient Estimator is to compute a set of DDPD Engine coefficients, W, used to predistort the transmit signal. To accomplish this, samples from the transmit path and a feedback path are captured, and covariance matrices A and B are computed for the current signal levels using optimized hardware. After the covariance matrices are computed, Gaussian elimination processing may be employed to compute the DDPD Engine coefficients.

These operations are performed by embedded hardware operators rather than a discrete DSP, and therefore embodiments of the invention are “dynamic” in the sense that they are fast enough to handle rapidly changing signal power levels. Digital and analog logic for implementing multipliers, adders, filters, delays, A/D and D/A converters, buffering, upsampling, RAM, upconversion, downconversion, crossbars and the like, well-understood by those skilled in the art, may be utilized according to embodiments of the invention. In addition, mathematical and hardware optimizations may be employed to simplify and reduce the number of multiplication operands and other operations, which can enable the DDPD system to fit within a single chip.

Embodiments of the invention are designed to handle linearization of non-linear PAs, especially high efficiency PAs (such as Doherty PAs), which have large memory effects and composite nonlinearity, while supporting transitioning signals. Using this linearization technique, PA chain efficiency in excess of 40% is achievable.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 a illustrates an exemplary cell site with a co-located radio server.

FIG. 1 b illustrates a cluster of cell sites in which there is a single radio server connected to each cell site.

FIG. 2 illustrates an exemplary power amplifier characteristic curve of input power (x-axis) versus output power (y-axis) showing compression at higher power levels.

FIG. 3 a illustrates an exemplary cell site for use within a communications network such as a cellular communications network according to embodiments of the invention.

FIG. 3 b illustrates an exemplary cluster of cell sites in which there is a single radio server connected by fiber optic lines in a daisy chain or parallel configuration to multiple RRHs, each RRH located at a different cell site according to embodiments of the invention.

FIG. 4 illustrates an exemplary plot of frequency versus output power for a multi-carrier signal showing the expected (ideal) output power levels and the actual output power levels as a result of output power “distortion.”

FIG. 5 is a frequency plot illustrating a simplified example of predistortion according to embodiments of the invention.

FIG. 6 illustrates a block diagram of an exemplary DDPD System according to embodiments of the invention.

FIG. 7 illustrates a block diagram of an exemplary Pre-DDPD Signal Conditioner according to embodiments of the invention.

FIG. 8 a illustrates an exemplary DDPD Engine according to embodiments of the invention.

FIG. 8 b illustrates an exemplary high order term filter that may be used in the exemplary DDPD Engine of FIG. 8 a according to embodiments of the invention.

FIG. 9 illustrates an exemplary Post-DDPD Signal Conditioner according to embodiments of the invention.

FIG. 10 illustrates an exemplary implementation of the Fine Delay block which can apply a one clock shift (at the actual data rate) to four parallel data paths according to embodiments of the invention.

FIG. 11 illustrates exemplary components of the RF Transmit Up Converter block according to embodiments of the invention.

FIG. 12 illustrates an exemplary Power Amplifier system according to embodiments of the invention.

FIG. 13 illustrates an exemplary RF Feedback block according to embodiments of the invention.

FIG. 14 illustrates an exemplary digital feedback processor according to embodiments of the invention.

FIG. 15 illustrates that g4 is applied to maintain constant Feedback Gain according to embodiments of the invention.

FIG. 16 illustrates an exemplary IFTBB processor that converts the IF signal to a resampled baseband signal according to embodiments of the invention.

FIG. 17 illustrates an exemplary DDPD/FB Correlator according to embodiments of the invention.

FIG. 18 illustrates an exemplary DDPD Coefficient Estimator according to embodiments of the invention.

FIGS. 19 a and 19 b show exemplary transmit and feedback signal amplitudes of the data before and after applying extrapolation scaling according to embodiments of the invention.

FIGS. 20 a and 20 b show an exemplary theoretical and practical extrapolation extension of an amplitude-amplitude curve according to embodiments of the invention.

FIGS. 21 a and 21 b show an exemplary theoretical and practical extrapolation extension of an amplitude-phase curve according to embodiments of the invention.

FIG. 22 illustrates an exemplary block diagram of the averaging, interpolation and extrapolation that is used to stabilize the DDPD coefficient solution according to embodiments of the invention.

FIG. 23 shows an exemplary performance degradation if the current solution is based on data with peaks limited to 2000 in amplitude.

FIG. 24 illustrates the benefits of Linear extrapolation according to embodiments of the invention.

FIG. 25 illustrates another benefit of Linear Extrapolation for low power inputs according to embodiments of the invention.

FIG. 26 illustrates an exemplary implementation block diagram of the normal matrices combining process according to embodiments of the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

In the following description of preferred embodiments, reference is made to the accompanying drawings which form a part hereof, and in which it is shown by way of illustration specific embodiments in which the invention can be practiced. It is to be understood that other embodiments can be used and structural changes can be made without departing from the scope of the embodiments of this invention.

Embodiments of the invention are directed to providing Dynamic Digital Pre-Distortion (DDPD) to rapidly correct PA non-linearity and memory effects. Objectives of this technique may include, but are not limited to: (1) correcting the nonlinearity and memory effects of the PA; (2) handling the dynamic signal and adapting to changing PA characteristics; and (3) performing high speed updates to handle fast changing data and PA characteristics.

The architecture of the DDPD engine is based on the nonlinear models and analysis found in “Nonlinear Microwave and RF Circuits,” 2nd ed., Stephen A. Maas, 2003, and “Nonlinear System Identification and Analysis with Applications to Power Amplifier Modeling and Power Amplifier Predistortion,” Raviv Raich, PhD dissertation submitted March 2004, both of which are incorporated by reference herein.

To perform pre-distortion according to embodiments of the invention, a DDPD engine predistorts an input signal in order to cancel PA nonlinearities as the signal is amplified by the PA. In effect, the DDPD engine produces the inverse of the PA characteristics dynamically. The DDPD engine is implemented as a composite of one linear filter and N−1 high order term linear filters. The input to the linear filter is the input signal. Each high order term filter has as an input some power of the amplitude of the input and each can have a different number of taps as well. The bank of linear filters has programmable complex coefficients provided by a DDPD Coefficient Estimator. The effect of the filtering is to predistort the input signal and cancel the PA distortion so that the baseband equivalent of the output of the PA is very close to being the same as the input signal.
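The filter-bank structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the hardware implementation of the embodiments: the function names `fir` and `ddpd_engine` are hypothetical, and the choice of x·|x|^p as the input to the p-th high order term filter is one assumed reading of "some power of the amplitude of the input."

```python
# Sketch of a predistorter built from one linear FIR filter plus N-1
# "high order term" FIR filters. All names are hypothetical.

def fir(taps, samples):
    """Causal FIR: y[n] = sum_k taps[k] * samples[n-k], zero initial state."""
    out = []
    for n in range(len(samples)):
        acc = 0j
        for k, w in enumerate(taps):
            if n - k >= 0:
                acc += w * samples[n - k]
        out.append(acc)
    return out


def ddpd_engine(x, coeffs):
    """Predistort x with a bank of FIR filters and sum the branch outputs.

    coeffs[p] holds the complex taps of branch p. Branch 0 filters x itself;
    branch p > 0 filters x * |x|**p (an assumed choice of basis). Branches
    may have different numbers of taps.
    """
    y = [0j] * len(x)
    for p, taps in enumerate(coeffs):
        branch_in = [s * abs(s) ** p for s in x]
        for n, v in enumerate(fir(taps, branch_in)):
            y[n] += v
    return y
```

With a single unit tap in the linear branch and no high order branches, the sketch passes the signal through unchanged; adding taps to the higher branches introduces amplitude-dependent and memory-dependent correction terms.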

The objective of the DDPD Coefficient Estimator is to compute a set of DDPD Engine coefficients, W, used to predistort the transmit signal. To accomplish this, samples from the transmit path and a feedback path are captured, and covariance matrices A and B are computed for the current signal levels using optimized hardware. After the covariance matrices are computed, Gaussian elimination processing may be employed to compute the DDPD Engine coefficients.
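As a rough software illustration of the estimation step, the sketch below accumulates A and B from captured samples and solves A·W = B by Gaussian elimination with partial pivoting. Several points are assumptions rather than statements from the specification: the regressors are built from the feedback samples and fitted to the transmit samples (an inverse-model fit), the basis is z·|z|^p, and the names `basis`, `estimate_coefficients` and `gaussian_solve` are hypothetical.

```python
def gaussian_solve(A, B):
    """Solve A w = B by Gaussian elimination with partial pivoting."""
    n = len(B)
    M = [list(A[i]) + [B[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) == 0:
            raise ValueError("singular matrix")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0j] * n
    for i in reversed(range(n)):  # back substitution
        s = M[i][n] - sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = s / M[i][i]
    return w


def basis(z, n, orders, taps):
    """Regressors at time n: z[n-k] * |z[n-k]|**p (assumed basis)."""
    row = []
    for p in range(orders):
        for k in range(taps):
            s = z[n - k] if n - k >= 0 else 0j
            row.append(s * abs(s) ** p)
    return row


def estimate_coefficients(tx, fb, orders=2, taps=1):
    """Accumulate A = sum(phi^H phi), B = sum(phi^H tx), then solve A W = B."""
    m = orders * taps
    A = [[0j] * m for _ in range(m)]
    B = [0j] * m
    for n in range(len(fb)):
        phi = basis(fb, n, orders, taps)
        for i in range(m):
            ci = phi[i].conjugate()
            B[i] += ci * tx[n]
            for j in range(m):
                A[i][j] += ci * phi[j]
    return gaussian_solve(A, B)
```

Because A and B are plain running sums over the captured samples, they can be accumulated sample by sample, and only the final small linear system needs to be solved.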

The above-mentioned complex operations are performed by utilizing embedded hardware operators rather than a discrete DSP, and therefore, embodiments of the invention are “dynamic” in the sense that they are fast enough to handle rapidly changing signal power levels. Digital and analog logic for implementing multipliers, adders, filters, delays, A/D and D/A converters, buffering, upsampling, RAM, upconversion, downconversion, crossbars and the like, well-understood by those skilled in the art, may be utilized according to embodiments of the invention. In addition, mathematical and hardware optimizations may be employed to simplify and reduce the number of multiplication operands and other operations, which can enable the DDPD system to fit within a single chip.

Embodiments of the invention are designed to handle linearization of non-linear PAs, especially high efficiency PAs (such as Doherty PAs), which have large memory effects and composite nonlinearity, while supporting transitioning signals. Using this linearization technique, PA chain efficiency in excess of 40% is achievable.

Although some embodiments of this invention may be described herein in terms of improving the linearization of high power amplifiers in radio transmitters (including RRH) located at cell sites, it should be understood that embodiments of this invention are not so limited, but are generally applicable to any non-linear system or elements.

FIG. 3 a illustrates an exemplary cell site 22 for use within a communications network such as a cellular communications network according to embodiments of the invention. The cell site 22 includes a radio server 24 and a cell tower 32. The radio server 24 includes a modem 26, modem software 8, a network interface, an operation and maintenance processor and software, and soft handoff switch software. The cell tower 32 includes one or more antennae 34 mounted to the top of the cell tower for transmitting and receiving wireless communication signals. Note that although the example of FIG. 3 a shows six antennae 34 representing six sectors, any number of antennae and sectors may be present.

The cell site 22 also includes one or more transceivers or RHs 36. Each RH 36 includes a power amplifier, D/A converters, A/D converters, RF upconverter (UC), RF downconverter (DC), and DDPD system 42. The RHs 36 may be located within the same box as the radio server 24 (e.g. on one or more cards in one or more slots in a rack-mounted configuration). When the radio server 24 and the RHs 36 are located within the same box, they may be referred to as “Node B” or as a basestation.

Alternatively, the RHs 36 may be located in a separate housing from the radio server 24 but connected to an antenna on the top of the cell tower through a lossy cable, or mounted at the top of the cell tower 32 near the antennae 34, which reduces the connection loss between the RHs and the antennae. When the RHs 36 are not located in the same box as the radio server 24, they may be referred to as RRHs. When the RRHs are located at the top of the cell tower 32, they may be referred to as tower-mounted RRHs.

FIG. 3 b illustrates a cluster of cell sites 22 in which there is a single radio server 24 connected by fiber optic lines 38 in a daisy chain or parallel configuration to multiple RRHs 40, each RRH located at a different cell site (although only one RRH is shown at one cell site for purposes of simplifying the figure). The RRHs 40 may be located at the base of the cell tower 32 at each cell site 22 or alternatively at the top of the cell tower in a tower-mounted configuration.

In either FIG. 3 a or 3 b, the DDPD system 42 according to embodiments of the present invention provides advantages to the overall cell site's or cluster of cell sites' performance. This is achieved by predistorting the digital transmit signal in order to cancel PA nonlinearities as the RF signal is transmitted through the PA to improve the linearization and efficiency of the PA.

As mentioned above, at higher input power levels a power amplifier's characteristic curve compresses and becomes non-linear, so that the actual amount of output power is less than what is expected under ideal conditions.

FIG. 4 illustrates an exemplary plot of frequency versus output power for a multi-carrier signal in which the expected (ideal) output power is represented at 48, but at power levels close to saturation, the actual output power as a result of output power “distortion” is represented at 50. This distortion is undesirable because there may be other devices transmitting at adjacent frequencies whose communications may be disrupted if the transmitted signal fails to meet its required adjacent channel power ratio and spectrum emission mask. The problem will be exacerbated when multi-carrier signals are radiated from the transmitter.

Embodiments of the invention characterize the distortion by taking the output of the PA, feeding it back into the system, comparing it to what was expected, and then applying an algorithm to compute a “pre-distortion” solution used in the DDPD Engine which results in a PA input signal which, when applied to the PA, will serve to minimize the distortion of the PA. This “pre-distorted” PA input signal is then fed into the PA so that the output of the PA will have reduced distortion and generate something close to the expected power levels (see reference character 52 in FIG. 4). Stated another way, the distortion at the output of the PA can be modeled as passing the signal to be transmitted through an ideal gain, G, followed by a distortion-producing filter, H. The DDPD system generates an inverse distortion or pre-distortion H⁻¹ (a reverse filter) to compensate for H, leaving the PA as an ideal gain.
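The idea of cascading an approximate inverse ahead of the PA can be illustrated with a toy numerical example. The PA model below (an ideal gain followed by a memoryless cubic compression term) and the first-order inverse are assumptions chosen for brevity; they are far simpler than the filter H discussed above, and the names `pa_model`, `predistort` and `output_error` are hypothetical.

```python
def pa_model(u, gain=10.0, a=0.1):
    """Toy memoryless PA: ideal gain with cubic compression (assumed model)."""
    return gain * u * (1.0 - a * abs(u) ** 2)


def predistort(x, a=0.1):
    """First-order approximation of the inverse of the compression term."""
    return x * (1.0 + a * abs(x) ** 2)


def output_error(x, use_predistortion, gain=10.0, a=0.1):
    """Distance between the PA output and the ideal output gain * x."""
    u = predistort(x, a) if use_predistortion else x
    return abs(pa_model(u, gain, a) - gain * x)
```

For an input amplitude of 0.5, the uncorrected output falls short of the ideal by 0.125, while the predistorted input brings the residual error below 0.01, so the cascade behaves much more like the ideal gain G alone.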

The DDPD system according to embodiments of the invention generates matrices which are used to compute “weights.” The weights, which are then used to generate the pre-distorted PA input signal, may be stored for later use whenever a particular power level is detected so that the matrices do not need to be recalculated. In other embodiments, these matrices may be pre-stored at the factory, and may be used initially and later changed if the predistortion algorithm needs different matrices for a different power level. In systems exposed to changing temperatures, the pre-stored weights will also need to be a function of temperature.
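One possible way to organize such stored weights is a table indexed by quantized power level and temperature. The sketch below is a hypothetical software analogue; the class name `WeightStore`, the bin widths, and the use of floor-based binning are all assumptions, not details from the specification.

```python
import math


class WeightStore:
    """Hypothetical cache of DDPD weights keyed by power and temperature bins."""

    def __init__(self, power_step_db=1.0, temp_step_c=10.0):
        self.power_step_db = power_step_db
        self.temp_step_c = temp_step_c
        self._table = {}

    def _key(self, power_db, temp_c):
        # Quantize both quantities so nearby operating points share an entry.
        return (math.floor(power_db / self.power_step_db),
                math.floor(temp_c / self.temp_step_c))

    def lookup(self, power_db, temp_c):
        """Return stored weights for this operating point, or None."""
        return self._table.get(self._key(power_db, temp_c))

    def store(self, power_db, temp_c, weights):
        """Save weights computed at this operating point."""
        self._table[self._key(power_db, temp_c)] = list(weights)
```

A caller would try `lookup` first when a power level is detected and fall back to recomputing the matrices only when it returns `None`.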

FIG. 5 is a frequency plot illustrating a simplified example of predistortion according to embodiments of the invention. Because the PA distortion is known to cause the amplitude of the signal to be reduced (a compression effect) at 54, the DDPD Engine will pre-distort the peaks of the PA input signal to a higher power level at 56. The result is a PA output with a power level approximately equal to the expected power level at 58. Note that there are other non-linear effects not compensated for in the example of FIG. 5.

It should be understood that in embodiments of the invention, the PA output power levels may be ramped up gradually, so that the extremely distorted high power output 50 shown in FIG. 4 may never be generated in practice. In other words, as the PA output power levels are gradually increased from a very low level to a certain higher level, the proper amount of pre-distortion for that higher power level is determined and applied. When the PA output power levels are increased again, the new distortion levels will be relatively small, and once again the proper amount of pre-distortion for that new power level is determined and applied. This process repeats until full output power levels are reached. This ramping up process may occur when a cell site is first powered up, so that as users and traffic gradually increase, predistortion matrices for those levels are computed and saved. Alternatively, a fixed ramp-up process (training) may be programmed during initialization, prior to servicing any users, or the fixed ramp-up process may be programmed to occur as the cell site becomes operational, where gradually increasing power levels will be transmitted to any users currently in communication with that cell site.

The causes of PA distortion can be separated into three major effects. The first is the so-called “memory effect.” If the PA generates a high power signal followed by a lower power signal, the high power signal leaves a residual or memory effect in the bias circuitry in the form of charged up charge storage devices such as capacitors or a fly-wheel effect in an inductor. These residual effects will change the operating point slightly, resulting in a different transfer function, and contribute to the distortion in the subsequent lower power signal. Therefore, it would be preferable to compensate for this memory effect with a characteristic that depends on the past signal.

The second effect is the so-called “thermal effect,” which is a form of memory effect. Whenever the PA output power level changes from one level to another, the thermal effects cause a certain amount of distortion. For example, if the PA output power level rapidly changes from low to high, the PA may be cool when it attempts to deliver high power, and this may contribute to the distortion since the PA will be increasing in temperature. Alternatively, if the PA output power level rapidly changes from high to low, the PA may be quite warm when it attempts to deliver lower power, and this may contribute to the distortion as the PA cools.

The third effect includes composite nonlinearities and transitional discontinuities. Highly efficient PAs are usually a composite of two amplifiers. At low power levels, only a class AB portion of the PA is activated. At high power levels, the class C section is turned on and contributes to the overall output power. If the input signal should jump between low and high power levels, a transition between the two amplification regimes will be experienced, and therefore, the discontinuity between the two modes creates nonlinearities and makes it more difficult to linearize the PA output power response. Other nonlinearities are caused by PA amplitude and phase distortion. One example of amplitude distortion is compression. Phase also distorts as power amplifiers are driven closer to compression.

FIG. 6 illustrates a block diagram of an exemplary DDPD System 42 and other transmit path functional blocks according to embodiments of the invention. The input signal is a baseband complex signal x₁[n]=I₁[n]+jQ₁[n] having a sampling rate R₁, and a signal bandwidth B₁<R₁.

In FIG. 6, the input signal x₁[n] is processed with the Pre-DDPD Signal Conditioner 100 to perform a number of functions. One function is to apply a gain to the signal using the gain scaler. The signal level needs to be set to the correct level so that the signal is not degraded due to low dynamic range while at the same time avoiding saturation. Gain scaling ensures that we have as high a signal as possible going into the DDPD engine 200 when the input is full power.
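The trade-off between dynamic range and saturation can be sketched as follows. This is a minimal illustration that derives the gain from the peak of a captured full-power block; the function name `gain_scale`, the `headroom` parameter and the peak-based rule are assumptions, since the specification does not state how the gain value is chosen.

```python
def gain_scale(x, full_scale, headroom=0.9):
    """Scale x so its largest magnitude sits at headroom * full_scale.

    Returns the scaled samples and the gain that was applied. The headroom
    factor (assumed) leaves margin below saturation.
    """
    peak = max(abs(s) for s in x)
    if peak == 0:
        return list(x), 1.0
    g = headroom * full_scale / peak
    return [g * s for s in x], g
```

In a fixed-gain design, the returned gain would be computed once for a full-power input and then held constant, so that lower-power inputs keep their relative level.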

Another optional function of the Pre-DDPD Signal Conditioner 100 is to resample the input signal to a sampling rate Rₛ to increase the bandwidth of the signal into the DDPD engine. The Pre-DDPD Signal Conditioner 100 may perform upsampling at a fixed sampling rate to enable the signal to be transmitted at a high sample rate (although up-sampling can also be performed at a later stage to increase the sample rate). This allows the DDPD engine to create a predistortion spectrum across a wider bandwidth.

Yet another optional function is to capture a section of the source signal and store it in a RAM to be used for AGC and performance analysis. Still another function is to detect large amplitude signals in order to trigger a data capture. This is done to ensure a more complete PA characterization which produces a more stable DDPD solution for large inputs.
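A peak-triggered capture of this kind might look like the sketch below. The function name `find_capture_window`, the fixed threshold, and the decision to center the window on the first qualifying sample are assumptions made for illustration only.

```python
def find_capture_window(x, threshold, length):
    """Return (start, stop) of a capture window around the first large sample.

    The window is centred on the first sample whose magnitude reaches
    threshold and is shifted as needed to stay inside the buffer. Returns
    None if no sample reaches the threshold.
    """
    for n, s in enumerate(x):
        if abs(s) >= threshold:
            start = max(0, min(n - length // 2, len(x) - length))
            return start, start + length
    return None
```

Capturing around a large peak means the stored block exercises the upper part of the PA characteristic, which is the region where the estimate otherwise has the least data.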

The DDPD engine 200 can be thought of as a digital filter that creates the pre-distortion. The DDPD Engine 200 uses the DDPD coefficients from the DDPD coefficient estimator 900 to predistort the signal. Since the DDPD coefficient estimator 900 models the inverse PA characteristic accurately, the DDPD engine 200 produces a predistorted signal which can precisely cancel the PA distortion. As a result, the nonlinearity and memory effects are suppressed which achieves a very good CPL at the output of the PA.

Before transmitting the DDPD Engine signal to the RF Transmit Up Converter 400 and PA 500, various signal processing functions are performed in the Post-DDPD Signal Conditioner 300. The Post-DDPD Signal Conditioner 300 performs upsampling and gain scaling, among other things that will be discussed in further detail below.

Another function is the Tx Signal Data Capture. In this function, a section of the DDPD signal is captured and stored in one or more RAMs for coefficient computation, signal correlation, and AGC purposes. (AGC processing of a RAM is optional.)

Another function is the Complex Multiplier. The Complex Multiplier performs complex multiplication to adjust gain and phase in case multiple linearization systems are combined.

Another function is the TX Compensation Filter. The TX Compensation Filter performs filtering to compensate for the Transmit Path Distortion.

Another function is the Digital Upconverter/Upsampler. The Digital Upconverter/Upsampler upsamples and converts the DDPD baseband signal into an IF signal before sending the digital IF data to a DAC. Alternatively, baseband I and Q data can be sent to two DACs and quadrature upconverted in the analog domain.
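The baseband-to-IF conversion can be illustrated with a textbook digital mixer: each complex baseband sample is rotated by a numerically generated local oscillator and the real part is sent on. The sketch assumes the samples have already been upsampled to the rate `fs`; the function name `digital_upconvert` and the parameter names are hypothetical.

```python
import cmath
import math


def digital_upconvert(x, f_if, fs):
    """Mix complex baseband x to a real IF signal.

    Computes Re{ x[n] * exp(j * 2*pi * f_if * n / fs) } for each sample.
    Assumes x is already at sampling rate fs (upsampling not shown).
    """
    out = []
    for n, s in enumerate(x):
        lo = cmath.exp(2j * math.pi * f_if * n / fs)
        out.append((s * lo).real)
    return out
```

Choosing the IF at one quarter of the sampling rate makes the oscillator take only the values 1, j, −1 and −j, which is a common way to avoid multipliers in the mixer.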

In the RF Transmit Up Converter (UC) 400, the IF digital samples are converted to an analog IF signal using a DAC, which is then RF upconverted to produce the desired RF signal. To assure high performance linearization, the DAC, IF filtering, RF mixer and RF filtering must have low distortion.

The RF transmit block 400 then transmits the pre-distorted signal to thePA system 500 for signal amplification to the desired level. Asmentioned above, highly efficient power amplifiers tend to be highlynon-linear and have large memory effects. This is why an effective DDPDengine and DDPD coefficient estimator which corrects for the non-linearand memory effects of such a power amplifier system is required in orderto achieve a highly efficient system.

A complete transmitter may require a RF Duplexer/Filter 600 if transmitand receive signals use the same antenna (but a duplexer is not requiredfor this invention or even related to this invention). A Duplexer is athree-port network that allows the transmitter and receiver to share thesame antenna. The duplexer also filters out some unwanted artifactsfurther out in the transmit spectrum. A system can also use separateantennas and optional separate transmit and receive filters.

The input to the RF feedback system 700 is tapped off of the PA signal output, usually using an RF coupler. This signal is then converted to an IF signal, filtered, digitized with a high speed ADC, and finally processed digitally in the DDPD following typical practices known to knowledgeable practitioners in the art. The ADC has a sampling rate large enough to capture the significant inter-modulations which need to be corrected. This RF feedback requires load matching to prevent RF or IF reflections that may add back to the main path, causing distortion and thus decreasing linearization performance. Alternatively, the feedback signal can be quadrature downconverted to baseband I and Q signals. Furthermore, to correct the RF Feedback gain change over temperature, a thermal sensor is placed near the RF Feedback, so that the gain change can be determined based on the RF Feedback temperature.

The Digital Feedback Processor 800 performs several tasks to condition the Feedback signal before it can be used for DDPD coefficient estimation, ensuring correct DDPD operation. One of the tasks is correcting the down converter gain error. Based on the reading of the thermal sensor located at the RF Feedback, the Feedback Gain Error Corrector digitally corrects the down converter gain error (the value of the gain error was calibrated in advance) by scaling the feedback data. This assures that the feedback path gain is accurate so that the AGC can accurately correct the gain offset due to the Transmit Upconverter and PA gain change. This gain correction can alternatively be done in the AGC computation, thus eliminating this scaler.

Another task is digital downconversion followed by up-sampling and down-sampling. The IF To Baseband (IFTBB) Converter is a digital processor that converts the ADC's digital IF signal into a baseband signal having the same sampling rate as the transmitted data capture and time aligned with the DDPD signal. This process performs the digital downconversion followed by up-sampling and down-sampling. Delay buffers are included between the up sampler and down sampler to enable correct time correlation. If quadrature downconversion is used, this section performs only filtering (if needed), upsampling and downsampling with the delay buffers.

The Digital Feedback Processor 800 also captures a section of the IFTBB signal and stores it in a RAM for signal correlation and AGC purposes. Alternatively, power estimators can be used to determine the signal power level at certain spots for AGC purposes. Another task performed by the Digital Feedback Processor 800 is the signal alignment between the DDPD signal and FB signal using correlation processing. In addition, depending on the calibration process, RF Feedback distortion can be computed in advance. The inverse distortion can be implemented as a finite impulse response (FIR) or infinite impulse response (IIR) filter to reverse the effects of the RF Feedback distortion, thus improving the linearization performance.
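The correlation-based alignment mentioned above can be illustrated with a minimal sketch. It assumes that the feedback capture is a delayed, scaled and phase-rotated copy of the transmit capture, and it searches integer lags only; the function name `estimate_delay` and the search range `max_lag` are hypothetical and are not taken from this specification.

```python
def estimate_delay(tx, fb, max_lag):
    """Return the integer lag (in samples) at which fb best matches tx.

    Assumption: fb[n + lag] is approximately g * tx[n] for some complex gain g.
    """
    best_lag, best_mag = 0, -1.0
    for lag in range(max_lag + 1):
        acc = 0j
        # Correlate over the region where both captures have samples.
        for n in range(min(len(tx), len(fb) - lag)):
            acc += fb[n + lag] * tx[n].conjugate()
        # The magnitude is used so that an unknown phase rotation is ignored.
        if abs(acc) > best_mag:
            best_lag, best_mag = lag, abs(acc)
    return best_lag
```

A fractional-sample alignment, which a practical system may need, is outside the scope of this sketch.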

A TX/FB AGC Processor within the Digital Feedback Processor 800 computes the ratio of either the Pre-DDPD signal or the post-DDPD signal to the Feedback signal and then compares that ratio to the expected gain to determine the gain offset. Because the gain offset error due to the RF Feedback may already be corrected if the gain is put in the feedback, the AGC gain offset measured is due to the RF Transmit Up Converter and PA system. This gain error can be corrected at the Post-DDPD Signal Conditioner or the RF Transmit Up Converter or both, in such a way that the signal entering the DAC is maintained at the desirable level.
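As a rough illustration of the gain offset computation, the sketch below compares the mean power of a feedback capture with that of a transmit capture and subtracts an expected gain. The use of mean power, the decibel scale, the direction of the ratio (feedback over transmit), and the names `mean_power` and `gain_offset_db` are assumptions made for this example only.

```python
import math

def mean_power(samples):
    """Average of |s|^2 over a capture of complex samples."""
    return sum(abs(s) ** 2 for s in samples) / len(samples)

def gain_offset_db(tx, fb, expected_gain_db):
    """Measured feedback/transmit power gain minus the expected gain, in dB."""
    measured_gain_db = 10.0 * math.log10(mean_power(fb) / mean_power(tx))
    return measured_gain_db - expected_gain_db
```

A positive result would indicate that the transmit path gain is higher than expected, so the AGC would reduce the gain in the Post-DDPD Signal Conditioner, the RF Transmit Up Converter, or both.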

A Feedback Gain Scalar is used to scale the IFTBB data to that of the DDPD data. Alternatively, the gain scaling can be implemented after the computation is completed, but in that case it is more complex. Typically this scaling gives the DDPD Engine unity gain. Alternatively, the DDPD Engine can correct for gain changes like an AGC, but in the best mode that task is separate.

One of the most important components of the DDPD system is the DDPD Coefficient Estimator 900. It is essential that computational accuracy be maintained in order to produce high performance weights. The coefficient estimation process is described in detail in other sections of this specification.

Note that blocks 100-600 are real-time in that the data is transmitted to the user in real time. Blocks 700-900 serve solely to compute new pre-distortion weights, and thus need not operate in real time, but instead may run relatively slowly. For example, every 666 microseconds a snapshot of data may be captured in a transmit capture random access memory (RAM) and in a feedback capture RAM in the Post-DDPD Signal Conditioner 300 block and the Digital Feedback Processor 800, respectively. From a comparison of those snapshots, the DDPD Coefficient Estimator 900 generates pre-distortion filter weights that are fed into the DDPD Engine 200. Those weights are written into the DDPD Engine to update the solution based on the current data capture, resulting in a better pre-distorted input signal. The rate at which new predistortion values must be applied depends on how fast the PA changes.

Pre-DDPD Signal Conditioner

FIG. 7 illustrates a block diagram of an exemplary Pre-DDPD Signal Conditioner 100 according to embodiments of the invention. This block has three sub-blocks: Input Signal Resampler 110, Input Gain Scaler 120, and Source Signal Capture RAM 130.

The main objective of the Pre-DDPD Signal Conditioner is to upsample and scale the signal for use by the DDPD Engine. Upsampling takes a data stream, which may have been sampled at 30 Msamples/sec, for example, and fills in the gaps between samples. FIG. 7 shows multiple stages of upsampling, although it should be understood that any number of stages may be used, including one or none. For example, to upsample by eight, three stages of upsampling by two may be used. Also, in this block, data may optionally be captured for automatic gain control (AGC) and weight verification purposes. The input to this block has a CPL at least 5 dB better than what is desired at the PA output. It is best if the CPL of the input is at least 10 dB better than desired.

Input Signal Resampler. With regard to the Input Signal Resampler block 110, the input, x_(in)[n], to the DDPD is typically the source signal that is processed with a crest factor processor. Generally, the sampling rate of the input signal is slightly higher than the signal bandwidth B_(in). Since the DDPD Engine creates a predistorted signal of x_(in)[n] to correct the PA non-linearity and memory effect, the signal x_(in)[n] must be interpolated to produce a signal, x[n], that has a bandwidth larger than the bandwidth extent of the predistorted signal. The bandwidth extent of the predistorted signal is a function of the inter-modulation order that is required for correction. Let N_(imd) be the maximum order of inter-modulation that is desired in the solution. We need to interpolate x_(in)[n] to a sample rate larger than f_(F)N_(imd)B_(in), where f_(F) is about 1.25 to 1.3 for filter transition. B_(in) is determined assuming x_(in)[n] is centered at zero frequency. A similar analysis can be done if this is not the case. Thus the interpolation factor is determined as:

$\begin{matrix}{K = \left\lceil {f_{F}\frac{N_{imd}B_{in}}{R_{in}}} \right\rceil} & (1.1)\end{matrix}$

where B_(in) is the input signal bandwidth, R_(in) is the input sampling rate, N_(imd) is the maximum order of inter-modulation that is correctable by the DDPD, f_(F) is the over-sampling factor to allow practical filtering, and ┌x┐ denotes the smallest integer greater than or equal to x. If K is a non-prime integer, K can be decomposed into multiple integers: K=K₁K₂K₃ . . . , and the interpolation process is a cascade of multiple simple upsampling interpolators of order K₁, K₂, K₃, . . . .

In between the upsampling interpolators, tap delay line buffers D₁, D₂, . . . can be used to provide a variable delay of the transmit signal for beamforming or other purposes. For example, a 4 carrier UMTS signal would occupy 20 MHz, and it is desirable to select an input signal sample rate of about R_(in)=8×3.84=30.72 Ms/s. If the DDPD is to correct up to 7^(th) order inter-modulation, then the resulting required sample rate would be 1.3×20×7=182 Ms/s. Because the input signal, x_(in)[n], has a sample rate of 30.72 Ms/s, the interpolation factor is conveniently selected as 6, giving an output rate of 6×30.72=184.32 Ms/s. Because the interpolation factor of 6 can be decomposed as 6=2×3, the interpolation by 6 can be obtained by cascading two interpolation filters of order 2 and 3.
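The worked example above can be checked numerically. The following is a minimal sketch of equation (1.1) together with a decomposition of K into prime upsampling stages; the function names are illustrative, and decomposing into primes is only one of several valid ways to split K.

```python
import math

def interpolation_factor(f_f, n_imd, b_in, r_in):
    """Equation (1.1): K = ceil(f_F * N_imd * B_in / R_in)."""
    return math.ceil(f_f * n_imd * b_in / r_in)

def prime_stages(k):
    """Split K into prime factors, one per cascaded upsampling stage."""
    stages = []
    p = 2
    while k > 1:
        while k % p == 0:
            stages.append(p)
            k //= p
        p += 1
    return stages

# 4 carrier UMTS example: f_F = 1.3, N_imd = 7, B_in = 20 MHz, R_in = 30.72 Ms/s
k = interpolation_factor(1.3, 7, 20.0, 30.72)   # 182 / 30.72 rounds up to 6
stages = prime_stages(k)                        # stages of 2 and 3
output_rate = k * 30.72                         # about 184.32 Ms/s
```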

Input Gain Scaler.

With regard to the Input Gain Scaler block 120, the DDPD processor has finite dynamic range that is determined by the bit width. If the signal level is too high, the signal is likely to be saturated. If the signal level is too low, the signal will lose precision. In both cases, the performance would be degraded. This scaler should be set so that the largest expected peak signal into the DDPD Engine is just below saturation. During operation, this scaler should not be changed, or if necessary, changed very slowly so as not to degrade performance. Alternatively, this scaler can go before the Input Signal Resampler.

Source Signal Capture RAM.

With regard to the Source Signal Capture RAM 130, N_(agc) samples of the input to the DDPD engine are captured to be used in the AGC processor or for performance analysis. This capture RAM is optional for both purposes; AGC may be performed using a real-time level estimator instead.

DDPD Engine

FIGS. 8 a and 8 b illustrate an exemplary DDPD Engine 200 according to embodiments of the invention. The purpose of this block is to predistort the input signal in order to cancel the PA nonlinearities as the signal is transmitted through the PA. In effect, the DDPD engine produces the inverse of the PA characteristics. The DDPD engine applies the DDPD coefficients to the transmitted signal to predistort the signal and cancel the PA distortion. The effect is that the baseband equivalent of the output of the PA is very close to being the same as the input signal.

The DDPD has two effects on the PA output. First, since the DDPD predistorts the signal to cancel the distortion of the PA, the out-of-band intermodulation distortion observed at the PA output is reduced. Because the input signal x[n] has very low out-of-band CPL, the output of the PA would also have very low out-of-band CPL. Second, the distortion corrected by the DDPD includes the in-band intermodulation distortion of the PA. This improves the error vector magnitude (EVM) between the PA output and the source signal.

Linear Digital Filter.

The input signal 210 passes through a linear digital filter 214. The linear digital filter 214 has one or more linear digital filter taps, each tap other than the first tap being successively delayed by one delay unit. It should be understood that although FIG. 8 a shows a general case of multiple high order term filter blocks 218 adding into multiple taps of the linear digital filter 214, in actual implementations only a subset of the high order term filter blocks 218 may actually be employed. A powers generator circuit 216 generates one or more powers (b values) of the transmit signal 210, where the b values represent powers of x[n]. Each high order term filter block 218 may receive a different set of one or more powers of the transmit signal 210 from the powers generator block 216.

High Order Term Filters.

One high order term filter block 218 is shown in FIG. 8 b. The high order term filter block 218 includes one or more linear digital filters 212, each linear digital filter 212 having one or more taps, each tap other than the first tap being successively delayed by one delay unit. Each linear digital filter 212 may receive as an input 220 any one or more of the powers of the transmit signal from the powers generator block shown in FIG. 8 a, delayed by a different amount as represented by delay blocks 222. The bold lines in FIG. 8 b represent complex numbers, while the non-bold lines represent real numbers. The outputs of the linear digital filters 212 in each high order term filter block 218 are all added together at 224 and added to the tap in the corresponding linear digital filter.

Each linear digital filter in FIGS. 8 a and 8 b has programmable complex coefficients or weights w which are provided by the DDPD Coefficient Estimator, to be discussed in detail hereinafter. The weights w represent a pre-distortion value computed based on a comparison of the pre-distorted transmit signal y[n] and a feedback signal z[n] derived from the output of the distorting element.

Denoting y[n] as the DDPD engine output, this implementation can be mathematically expressed as

$\begin{matrix}{y[n] = DPD\left(x[n]\right)} & (2.1) \\ {\begin{aligned}
y[n] ={} & w_{11}x[n] + w_{12}x[n-1] + \ldots + w_{1,d}x[n-d+1] \\
& + x[n]\begin{pmatrix}
w_{1,21}b^{k_{12}-1}[n-\lambda_{12}] + w_{1,22}b^{k_{12}-1}[n-\lambda_{12}-1] + \ldots + w_{1,2q_{12}}b^{k_{12}-1}[n-\lambda_{12}-q_{12}+1] + \\
w_{1,31}b^{k_{13}-1}[n-\lambda_{13}] + w_{1,32}b^{k_{13}-1}[n-\lambda_{13}-1] + \ldots + w_{1,3q_{13}}b^{k_{13}-1}[n-\lambda_{13}-q_{13}+1] + \\
\ldots \\
w_{1,N_{1}1}b^{k_{1N_{1}}-1}[n-\lambda_{1N_{1}}] + w_{1,N_{1}2}b^{k_{1N_{1}}-1}[n-\lambda_{1N_{1}}-1] + \ldots + w_{1,N_{1}q_{1N_{1}}}b^{k_{1N_{1}}-1}[n-\lambda_{1N_{1}}-q_{1N_{1}}+1]
\end{pmatrix} + \ldots \\
& + x[n-r+1]\begin{pmatrix}
w_{r,21}b^{k_{r2}-1}[n-\lambda_{r2}] + w_{r,22}b^{k_{r2}-1}[n-\lambda_{r2}-1] + \ldots + w_{r,2q_{r2}}b^{k_{r2}-1}[n-\lambda_{r2}-q_{r2}+1] + \\
w_{r,31}b^{k_{r3}-1}[n-\lambda_{r3}] + w_{r,32}b^{k_{r3}-1}[n-\lambda_{r3}-1] + \ldots + w_{r,3q_{r3}}b^{k_{r3}-1}[n-\lambda_{r3}-q_{r3}+1] + \\
\ldots \\
w_{r,N_{r}1}b^{k_{rN_{r}}-1}[n-\lambda_{rN_{r}}] + w_{r,N_{r}2}b^{k_{rN_{r}}-1}[n-\lambda_{rN_{r}}-1] + \ldots + w_{r,N_{r}q_{rN_{r}}}b^{k_{rN_{r}}-1}[n-\lambda_{rN_{r}}-q_{rN_{r}}+1]
\end{pmatrix} + \ldots \\
& + x[n-d+1]\begin{pmatrix}
w_{d,21}b^{k_{d2}-1}[n-\lambda_{d2}] + w_{d,22}b^{k_{d2}-1}[n-\lambda_{d2}-1] + \ldots + w_{d,2q_{d2}}b^{k_{d2}-1}[n-\lambda_{d2}-q_{d2}+1] + \\
w_{d,31}b^{k_{d3}-1}[n-\lambda_{d3}] + w_{d,32}b^{k_{d3}-1}[n-\lambda_{d3}-1] + \ldots + w_{d,3q_{d3}}b^{k_{d3}-1}[n-\lambda_{d3}-q_{d3}+1] + \\
\ldots \\
w_{d,N_{d}1}b^{k_{dN_{d}}-1}[n-\lambda_{dN_{d}}] + w_{d,N_{d}2}b^{k_{dN_{d}}-1}[n-\lambda_{dN_{d}}-1] + \ldots + w_{d,N_{d}q_{dN_{d}}}b^{k_{dN_{d}}-1}[n-\lambda_{dN_{d}}-q_{dN_{d}}+1]
\end{pmatrix}
\end{aligned}} & (2.2)\end{matrix}$

Written compactly,

$\begin{matrix}{\underline{y}[n] = \left\{ \sum\limits_{p=1}^{d}\sum\limits_{r=2}^{N_{p}}\sum\limits_{s=1}^{q_{pr}} \underline{w}_{p,rs}\left| x\left[n-\lambda_{pr}-s+1\right] \right|^{k_{pr}-1} x\left[n-p+1\right] \right\} + \sum\limits_{p=1}^{d} \underline{w}_{1p}\, x\left[n-p+1\right]} & (2.3)\end{matrix}$

where x[n] is the DDPD input signal, b[n] is the amplitude of the input (i.e., b[n]=|x[n]|), y[n] is the DDPD Engine output, and k_(ri) and q_(ri) are the power and the filter length of the i^(th) high order term filter of the r^(th) higher order term block (e.g., k₃₂ and q₃₂ are the polynomial power and the length of the second filter of the third higher order term block, respectively). The terms k_(ri) represent real numbers, d is the number of taps of the baseband filter, λ_(ri) is the number of delays of the i^(th) high order term filter of the r^(th) higher order term block, w_(1j) is the coefficient for the j^(th) tap of the baseband filter, and w_(r,ij) is the coefficient for the j^(th) tap of the i^(th) high order term filter of the r^(th) higher order term block.
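To make the structure of equation 2.3 concrete, the following is a minimal software sketch of the DDPD Engine computation. It is not the hardware implementation of FIGS. 8 a and 8 b. The data layout is an assumption made for illustration: `w_lin` holds the d baseband coefficients w_(1p), and `blocks[p]` holds the high order term filters of block p+1 as tuples `(k, lam, taps)`, where `taps` are the q coefficients. Samples before the start of the capture are assumed to be zero.

```python
def ddpd_output(x, w_lin, blocks):
    """Evaluate y[n] per equation 2.3 for a list of complex samples x."""
    def at(n):
        # Assumption: the signal is zero before the first captured sample.
        return x[n] if 0 <= n < len(x) else 0j

    y = []
    for n in range(len(x)):
        acc = 0j
        for p in range(len(w_lin)):            # zero-based p: tap x[n - p]
            # Start from the baseband coefficient for this tap ...
            gain = w_lin[p]
            # ... and add the outputs of the high order term filters,
            # each fed with a delayed power of b[n] = |x[n]|.
            for (k, lam, taps) in blocks[p]:
                for s, w in enumerate(taps):   # zero-based tap index s
                    gain += w * abs(at(n - lam - s)) ** (k - 1)
            acc += gain * at(n - p)
        y.append(acc)
    return y
```

For example, a single tap with `w_lin = [1]` and one third-order filter `(3, 0, [0.1])` yields y[n] = x[n](1 + 0.1|x[n]|²), a memoryless cubic predistorter; adding taps and delays introduces the memory terms.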

The top line of equation 2.2 represents the linear digital filter 214 in FIG. 8 a. A digital filter is usually implemented with tapped delays of the original input signal x, and each delay is weighted with a different weight w, which is received from the DDPD Coefficient Estimator, to be discussed in greater detail hereinafter. The first term is the weighted original undelayed input signal x[n]. The other terms, which represent the original input signal delayed by various amounts, are needed because the DDPD engine needs to model the memory effects in the PA as described above, and generate a DDPD engine output that accounts for these memory effects by inverting the PA's characteristics.

Because the PA has high order nonlinearities, as discussed above, those high order nonlinearities need to be reproduced by having powers of the input signal x[n] in the filter solution of equation 2.2 (where the b values generated from the Powers Generator block 216 represent powers of x[n]). High order term filters capable of representing these high order nonlinearities are represented in the bottom terms of equation 2.2, which add to the baseband taps, x[n−J−1], 0<J<d−1 in the linear digital filter 214. In the bottom terms of equation 2.2, within the large parentheses, each line is a nonlinear digital filter representing different orders of x[n] (e.g. x[n]^3, x[n]^4, etc.) with various delayed taps. These are the linear digital filters 212 shown in FIG. 8 b. The outputs of the filters are nonlinear because they receive as input a power of x[n] (the input signal) rather than x[n] itself. Otherwise, structurally the filters are similar. Each term in each line is essentially a different filter itself.

In the exemplary DDPD Engine shown in FIGS. 8 a and 8 b, the “T” boxes mean a one clock delay. In FIG. 8 a, the “Powers Generator” box computes b[n]=|x[n]| and raises this real value to different powers,

$\begin{matrix}{\left\{ b^{k_{r2}-1}[n],\; b^{k_{r3}-1}[n],\; \ldots,\; b^{k_{rN_{r}}-1}[n] \right\}.} & (2.5)\end{matrix}$

Note that the computation can be done using any method as long as the terms are mathematically equal to these b's. The b^(k_(ri)−1)[n] terms are fed through q_(ri)-tap filters having the coefficients {w_(r,i1), w_(r,i2), . . . , w_(r,iq_(ri))}. The outputs of these filters are summed with the coefficient w_(1r). The coefficients of the data-path filter are then applied as the filter coefficients to the signal x[n] to produce the DDPD signal y[n].

By adding more structure onto each baseband tap coefficient, this more generic structure extends the structure in “A Robust Digital Baseband Predistorter Constructed Using Memory Polynomials,” by Lei Ding, G. Tong Zhou, Zhengxiang Ma, Dennis R. Morgan, J. Stevenson Kenney, Jaehyeong Kim, and Charles R. Giardina (manuscript submitted to IEEE Trans. On Communication, Mar. 16, 2002), the contents of which are incorporated herein by reference.

Post-DDPD Signal Conditioner

FIG. 9 illustrates an exemplary Post-DDPD Signal Conditioner 300 according to embodiments of the invention. The objective of the Post-DDPD Signal Conditioner 300 is to condition the signal into a form that can be easily RF upconverted and amplified.

For multiple antenna transmission, it is often required that the signal amplitude and phase be adjusted to implement beam-steering. This function is provided by the use of the programmable Complex Multiplier 330. The Transmit Signal Gain 340 (which effectively can be part of the Complex Multiplier) can be controlled by the AGC circuitry (or any processor) to place the signal into the DAC at the correct level. A programmable complex FIR filter called the Transmit Path Distortion Compensator 350 is implemented so that after calibration, this filter will correct the linear distortion due to the DAC and Transmit Upconverter. Next is the Digital Baseband To IF Upconverter 360, which performs filtering, upsampling, fine time delay, and fs/4 frequency translation to place the DDPD signal at the correct IF frequency for transmission.

The five sub-blocks of the Post-DDPD Signal Conditioner, represented by the DDPD Signal Capture RAM, Complex Multiplier, Transmit Signal Gain, Transmit Path Distortion Compensator, and Digital Baseband To IF Upconverter blocks, will now be examined in detail.

DDPD Signal Capture RAMs.

The output of the Transmit Limiter is the baseband version of the PA input signal. This signal is tapped off and sent to the DDPD Signal Capture RAMs 320, one for the correlation and one for coefficient estimation, although one RAM could be shared. The coefficient RAM can also be used for AGC processing, but alternatively a real-time level estimator can be used. The signal to the correlation RAM is captured as is. The signal to the coefficient estimation RAM is delayed by N_(tx_delay) samples, and L samples of this complex signal y[n]=(I_(DPD)[n]+jQ_(DPD)[n]) are captured. If the A and B matrix computation (discussed later) takes place at the data rate speed, no data capture is required. The data goes straight into the weight computation engine.

Complex Multiplier.

It is desirable to be able to vary the amplitude and phase of the post DDPD Engine signal. To support this function, a complex multiplier 330 is used to provide this gain adjustment.

I _(out) [n]=CM_(i) *I _(in) [n]−CM_(q) *Q _(in) [n]  (3.4)

Q _(out) [n]=CM_(q) *I _(in) [n]+CM_(i) *Q _(in) [n]  (3.5)

where I_(in)[n], Q_(in)[n] and I_(out)[n], Q_(out)[n] are the input and output of the Complex Multiplier, respectively, and CM_(i) and CM_(q) are the real and imaginary components of the multiplier. If we write (CM_(i)+j CM_(q))=Ae^(jθ), then it is apparent that CM_(i) and CM_(q) implement an amplitude and phase change.
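A minimal sketch of equations (3.4) and (3.5) follows. It shows how a desired amplitude A and phase θ map onto CM_(i) and CM_(q); the function names are illustrative and not part of this specification.

```python
import cmath

def complex_multiplier_coeffs(amplitude, phase_rad):
    """Return (CM_i, CM_q) such that CM_i + j*CM_q = A * exp(j*theta)."""
    c = cmath.rect(amplitude, phase_rad)
    return c.real, c.imag

def complex_multiply(i_in, q_in, cm_i, cm_q):
    """Apply equations (3.4) and (3.5) to one I/Q sample."""
    i_out = cm_i * i_in - cm_q * q_in
    q_out = cm_q * i_in + cm_i * q_in
    return i_out, q_out
```

With A=2 and θ=π/2, for instance, the sample (I, Q)=(1, 0) is scaled by two and rotated onto the Q axis, up to floating point rounding.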

Transmit Signal Gain.

During operation, the TX/FB AGC may adjust, in the Transmit Signal Gain block 340, the gain of the signal that is sent to the DAC, in concert with adjusting an analog gain, g₂, discussed later.

I _(out) [n]=g ₁ I _(in) [n]  (3.6)

Q _(out) [n]=g ₁ Q _(in) [n]  (3.7)

where I_(in)[n], Q_(in)[n] and I_(out)[n], Q_(out)[n] are the input and output of the Transmit Signal Gain, respectively, and g₁ is the gain value that is provided from the AGC or other processor. This gain may be incorporated into the complex multiplier.

Transmit Path Distortion Compensator (Equalizer).

The DAC and the Transmit RF Upconverter have amplitude and/or phase distortion that may need to be corrected using the Transmit Path Distortion Compensator (Equalizer) 350 to optimize the linearization performance. Referenced to the baseband domain, the transfer function H_(DAC)(f) of the DAC generally has a sin(x)/x filter shape:

$\begin{matrix}{H_{DAC}(f) = \frac{\sin\left(\pi\left(f - f_{IF}\right)T_{s}\right)}{\pi\left(f - f_{IF}\right)T_{s}};\quad -\frac{B_{imd}}{2} < f < \frac{B_{imd}}{2},} & (3.8)\end{matrix}$

where f_(IF) is the IF center frequency of the DAC input signal, 1/T_(s) is the DAC conversion rate, and B_(imd) is the bandwidth of the transmit signal including the highest inter-modulation distortions that are required to be corrected.

The baseband version of the transfer function H_(UC)(f) of the Transmit RF Upconverter can be measured. Thus the transfer function for the required correction filter is expressed as:

$\begin{matrix}{H_{TXC}(f) = \frac{\pi\left(f - f_{IF}\right)T_{s}}{\sin\left(\pi\left(f - f_{IF}\right)T_{s}\right)}\,\frac{1}{H_{UC}(f)};\quad -\frac{B_{imd}}{2} < f < \frac{B_{imd}}{2}} & (3.9)\end{matrix}$

This filter is then implemented as a complex digital FIR filter or IIR filter with complex coefficients using techniques known to the knowledgeable practitioner in the art. It is preferred that a FIR filter be used in order to maintain the signal stability and retain linear phase.
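As a sketch of how the target response of equation (3.9) might be tabulated before a FIR or IIR design step, the code below evaluates equations (3.8) and (3.9) at a given frequency. The callable `h_uc`, standing in for the measured upconverter response, is hypothetical, and the filter design itself is not shown.

```python
import math

def dac_response(f, f_if, t_s):
    """Equation (3.8): sin(x)/x with x = pi * (f - f_IF) * T_s."""
    x = math.pi * (f - f_if) * t_s
    return 1.0 if x == 0.0 else math.sin(x) / x

def tx_correction(f, f_if, t_s, h_uc):
    """Equation (3.9): inverse of the DAC roll-off times 1 / H_UC(f).

    h_uc is assumed to be a callable returning the measured (possibly
    complex) upconverter response at frequency f.
    """
    return 1.0 / (dac_response(f, f_if, t_s) * h_uc(f))
```

By construction, the product of the correction, the DAC response and the upconverter response is unity at every frequency where the sin(x)/x term is non-zero.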

Digital Baseband to IF Upconverter.

The baseband transmit signal, y[n], has a sampling rate R_(s), and occupies a bandwidth of less than R_(s). This signal needs to be converted to an analog signal and up-converted to the desired RF frequency in the Digital Baseband to IF Upconverter 360 before being sent to the power amplifier. There are two approaches for Baseband to RF modulation.

In Analog Direct conversion, the baseband I and Q signals are separately converted to baseband analog signals. The signals are then filtered with low-pass filters and the in-phase and quadrature components are quadrature modulated to the desired RF frequency. Major limitations of this technique are amplitude imbalance and phase offset between the two DACs and their following baseband filters, and errors in the phase of local oscillators (LO's) utilized in the quadrature modulator. The amplitude and phase mismatch between the analog in-phase and quadrature signals prevents the quadrature modulator from completely suppressing the signal images. In order to suppress the signal images to the desired level, the in-phase and quadrature DAC's and the associated low-pass filters must be fine-tuned, which can be impractical for wideband transmission. An alternative is to use a compensation filter in the digital domain, but this requires calibration and additional hardware complexity.

A more effective method for signal conversion is to perform IF up-conversion in the digital domain (IF sampling). This method alleviates the amplitude and phase mismatch and allows for much easier suppression of the signal images. In general, upsampling and mixing are required in this block. Upsampling can be done with a number of known methods found in textbooks. Typically, upsampling by a factor of four results in a desirable sampling rate and bandwidth into a DAC. It may be necessary to do some of the upsampling using parallel datapaths in order to keep the data rate within practical limits based on the ASIC technology used to implement the circuit. Before mixing, the data can be delayed at the higher sample rate in order to provide more delay resolution.

FIG. 10 illustrates an implementation of an exemplary Fine Delay structure which can apply a one clock shift (at the actual data rate) to four parallel data paths according to embodiments of the invention. The fine delay structure is part of the Digital Baseband to IF Up-Converter 360. The fine delay structure enables the aligning of the timing of two devices such as two transmitters to a fine resolution of one clock period of a data path having a particular data rate. For transmit signals with high data rates, it will probably be necessary to use parallel paths after the digital baseband to IF converter block. For example, given a data rate of 720 Megasamples per second (MSPS), the data can be split into four paths, each now at 180 MSPS. To apply a delay equal to one 720 MSPS clock on the data path, a structure such as that shown in FIG. 10 will have to be implemented. By combining the programmable one slow clock delay with the crossbar switch, the delays at the faster clock rate can be implemented. The T's represent one clock delay at the 180 MSPS clock rate. The table under the block diagram shows how to program the muxes in order to apply a delay of 0-3 fast clocks.

To explain the operation of the exemplary fine delay structure shown in FIG. 10, suppose that a data path receives data samples X(6), X(5), X(4), X(3), X(2), X(1), X(0), with X(0) being the first sample received and X(6) being the last sample received. In this example, the data stream is 720 MSPS, and the chip cannot sustain this data rate, so a commutator is used to split it into four separate parallel data paths as shown in FIG. 10. The samples are essentially time division multiplexed so that successive data samples are routed to the parallel data paths in a round-robin fashion. For example, X(0) goes to path 1, X(1) goes to path 2, X(2) goes to path 3, X(3) goes to path 4, X(4) goes back to path 1, and so forth. Three of the paths can be delayed by one clock period at one-fourth of the 720 MSPS data rate (see delay T in FIG. 10), or not delayed at all, by selection of a multiplexer (MUX) 316 in each path. The crossbar 318 can further be programmed as shown in the table of FIG. 10 to route the four data paths to the four output paths. The four parallel outputs can then be re-combined and serialized back into a single line at a rate of 720 MSPS using a commutator. By appropriate configuration of the multiplexers 316 and crossbar 318, an effective delay of 0, 1, 2 or 3 clock periods at the 720 MSPS data rate is created.
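The behavior described above can be modeled in a few lines. The sketch below is a behavioral model only: the particular choice of which paths receive the slow-clock delay and how the crossbar rotates them is an assumption that reproduces a 0 to 3 fast-clock delay, and it is not claimed to match the mux table of FIG. 10. The stream length is assumed to be a multiple of the number of paths.

```python
def commutate(stream, n_paths=4):
    """Round-robin split of a serial stream into parallel slow-rate paths."""
    return [stream[j::n_paths] for j in range(n_paths)]

def fine_delay(paths, delay):
    """Delay the equivalent serial stream by `delay` fast clocks (0..n-1)."""
    n_paths = len(paths)
    out = []
    for j in range(n_paths):
        src = (j - delay) % n_paths           # crossbar: rotate the paths
        if src >= n_paths - delay:
            # MUX selects the one slow-clock delay (T) on this source path.
            out.append([0] + paths[src][:-1])
        else:
            out.append(list(paths[src]))      # MUX bypasses the delay
    return out

def serialize(paths):
    """Output commutator: interleave the parallel paths back to one stream."""
    return [s for group in zip(*paths) for s in group]
```

In this model the first path never needs the slow-clock delay, which is consistent with only three of the four paths having a selectable delay.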

RF Transmit Up Converter

FIG. 11 illustrates exemplary components of the RF Transmit Up Converter block 400 according to embodiments of the invention. The purpose of this block is to convert the digital IF samples (assuming analog direct conversion is not done) of the upsampled and mixed DDPD Engine output into an RF signal at the desired power level for input into the PA. The IF digital samples are converted to an analog IF signal via a high speed DAC, and are then RF upconverted to produce the desired RF signal. This output signal is then amplified using a power amplifier as described in the PA block.

DAC.

The output of the DDPD is a predistorted digital signal which may have been digitally upsampled and upconverted before being sent to the DAC 410. This signal is converted to an analog IF signal using a high speed and high dynamic range DAC. To assure no saturation in data transmitted to the DAC, the Pre and Post DDPD scalers must be set such that the DAC RMS level is below full scale minus the maximum peak to average ratio of the input to the DAC (which includes the additional peaking created due to the DDPD Engine). There should be some margin in this setting to allow for unexpected peaks.

The DAC is operated at the input sample rate and must have an output noise floor lower than the required spectral mask emission specification. The required DAC noise density can be expressed as follows:

SNR_(DAC,dBFS/Bs)>MPAR_(TX,dB)+10*log 10(N_(c))+CPL_(dB/Bs)+Margin_(SEM,dB)  (4.1)

where SNR_(DAc, dBFS/Bs) is the signal to noise ratio of the DAC (indB/Bs), Bs is the carrier bandwidth, MPAR_(TX,dB) is the projectedmaximum peak to average ratio of the transmit signal when transmittingat full peak power, N_(c) is the number of carriers, CPL_(dB/Bs) is thecarrier power leakage ratio (in dB with respect to one carrier) andMargin_(SEM,dB) is the margin (dB) to assure that the DAC noise is lowenough so that when added to all the other system noise, the combinednoise meets the required SEM level.
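As a worked illustration of equation (4.1), the following sketch evaluates the right-hand side for one set of inputs. The numeric values are hypothetical and chosen only to show the arithmetic; they are not taken from any specification.

```python
import math


def required_dac_snr_db(mpar_tx_db, n_carriers, cpl_db, margin_sem_db):
    """Right-hand side of equation (4.1): the DAC SNR must exceed this value."""
    return mpar_tx_db + 10.0 * math.log10(n_carriers) + cpl_db + margin_sem_db


# Hypothetical example: 10 dB peak to average ratio, 4 carriers,
# 60 dB carrier power leakage ratio, 6 dB margin.
snr_floor_db = required_dac_snr_db(10.0, 4, 60.0, 6.0)   # about 82.02 dB
```

Each doubling of the number of carriers raises the requirement by about 3 dB, since the same full scale range must be shared among more carriers.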

IF Filter/Amplifier.

The Transmit Upconverter 400 performs IF filtering in the IF Filter/Amplifier block 420 to remove the DAC clock noise and IF image. This filter must have low amplitude and phase distortion. The signal is then amplified to the desired level and sent to the RF mixing block.

RF Mixer.

Following the IF filter, the Transmit Upconverter 400 performs RF mixing of the IF signal to the desired frequency in the RF Mixing block 430. This RF mixing process must have low distortion (high IIP3) and low phase noise to minimize its effects on linearization and spectral emissions.

RF Filtering and Amplification.

The RF filtering and Amplification block 440 performs RF filtering to reject the IF image. This filter must have low amplitude and phase distortion.

RF Gain Adjust.

During operation, the TX/FB AGC may adjust the gain g₂ of the signal that is sent to the Power Amplifier in the Gain Adjustment block 450. The RF signal can be increased or decreased in concert with the Transmit Signal Gain, g₁, to maintain constant gain for the AGC loop. The output of the Gain Adjustment block goes to the Power Amplifier.

Power Amplifier System

FIG. 12 illustrates an exemplary Power Amplifier system 500 according to embodiments of the invention. The signal at the output of the Transmit Upconverter is typically low power, and requires a very large gain, G_(PA), to bring the signal to the desired transmit level. It is typical to divide the Power Amplifier into three or more amplifier stages to produce a large gain (although fewer stages are also possible):

G _(PA,dB) =G _(PA1,dB) +G _(PA2,dB) +G _(PA3,dB)  (5.1)

where G_(PA1,dB), G_(PA2,dB) and G_(PA3,dB) are the gains of the Pre-Amplifier, Driver Amplifier and High Power Amplifier, respectively.

Pre-Amplifier.

Pre-Amplifier block 510 is a very linear PA with gain G_(PA1,dB). It takes an ultra low input power signal and produces a low power level signal.

Driver Amplifier.

Driver Amplifier block 520 is also a linear PA but with gain G_(PA2,dB). It takes a low signal power level and produces an intermediate power level signal.

High Power Amplifier.

To maximize the PA efficiency, this PA 530 can be a non-linear PA which can be Class AB or any configuration such that the conduction angle is set by the quiescent point. Alternatively a composite structure such as a Doherty PA can be used. A Doherty PA is composed of a combination of a minimum of two active amplifying devices where the conduction angles of each are controlled separately. In common structures the main amplifier is set at class AB and the peaking (auxiliary) amplifier is set at class C. Other variations of Doherty configurations are also possible and applicable where the gate and drain voltages are controlled to enhance the efficiency and the linearity of the amplifier circuit. The efficiency improvement in a Doherty amplifier stems from operating the main amplifier in saturation in the PA back off region. The class C amplifier section will turn on at a higher level and hence the amplifier will remain in the saturation region for a larger part of the signal dynamics. Amplifiers operated in saturation mode are highly efficient, but efficiency and linearity are mutually exclusive. Therefore, the Doherty amplifier and indeed any other high efficiency technique will lead to highly nonlinear behavior. Another approach to achieve high efficiency in the PA chain is a technique known as envelope tracking. Yet another technique is switching amplifiers that offer the highest efficiencies. These are class D, E, F, and hybrid combinations with the Doherty pair. The nature of non-linearity in an amplifier is threefold: static nonlinearity, linear dynamics, and non-linear dynamics (the dynamics being otherwise known as memory effects). Since linear amplification is a system requirement, the main DDPD objective is to pre-distort the signal to cancel the distortion in order to produce a signal that has low distortion, high fidelity, and low CPL.

Transmit Coupler for Feedback.

A signal splitting mechanism such as a directional coupler 540 is placed at the PA output which is used to tap off the PA output signal back to the feedback path for DDPD coefficient estimation and AGC control. The main path of the coupler goes to the Duplexer (if one is used).

RF Duplexer/Filter

The signal after the Transmit Coupler is sent to the RF Duplexer 600 in systems where both receive and transmit use the same antenna. The duplexer allows transmit and receive paths, which are in different frequency bands, to use the same antenna. It is a 3-port device with one bi-directional port, an input port and an output port. The input and output ports operate on different frequencies. The Duplexer includes filters to separate these frequencies, preventing crosstalk. Thus the Duplexer both acts as an RF filter, which removes the out of band inter-modulation distortion so that this signal will not cause spectral interference, and minimizes the noise to the Uplink Receiver that may be co-located with the Transmitter. In systems with separate antennas for transmit and receive, or TDD systems, no duplexer is required. Instead, a filter is used to reduce out-of-band emissions.

RF Feedback

FIG. 13 illustrates an exemplary RF Feedback block 700 according to embodiments of the invention. The purpose of this block is to downconvert and digitize the PA feedback signal. The TX PA output is first downconverted into an IF signal, and then converted into digital samples using an ADC.

RF Temperature Sensor.

A temperature sensor 710 is placed in the vicinity of the PA and the RF Feedback Downconverter so that the PA characteristics can be monitored and maintained. The gain error of the Feedback Downconverter versus temperature should be characterized so that it can be corrected by the TX/FB AGC Processor. The temperature sensor provides a digital signal that is sent to the TX/FB AGC Processor.

DC Gain Adjustment.

The DC Gain Adjustment block g₃ 720 is used to ensure that the downconverted signal has sufficient dynamic range into the ADC. This gain can be controlled independently of the AGC loop which controls the output power.

RF Mixing.

Following the Gain Adjustment, g₃, the Feedback Downconverter performs RF mixing 730 of the RF signal to the desired IF frequency. The IF frequency is preferably at 3R_(ADC)/4 or R_(ADC)/4, where R_(ADC) is the ADC sampling rate, so that the ADC can capture the signal digitally with maximum bandwidth. The L.O. reference for the feedback mixer is the same as the TX upconverter mixer so that the feedback signal is coherent with the transmit signal. Alternatively, quadrature downconversion can be employed. This RF mixing process must have low distortion (high IIP3) and low phase noise to minimize its effect on linearization.

IF Filtering and Amplification.

The Amplification and IF filtering block 740 performs signal amplification and IF anti-aliasing filtering. This filter removes the out of band signal so that the output signal can be effectively captured with the ADC.

Feedback ADC.

The output of the IF filter is an analog IF signal centered preferably at ¾R_(ADC). This signal is converted to a digital IF signal using a high speed and high dynamic range ADC 750. To assure no saturation on the ADC, the Gain Adjustment, g₃, is set such that the largest rms signal level into the ADC is at the desired level. This level is:

RMS _(ADC,dB) =ADC _(full_scale)−MPAR_(SIG,dB)−Margin_(ADC,dB)  (7.1)

where MPAR_(SIG,dB) is the maximum peak to average ratio of the signal into the ADC and Margin_(ADC,dB) is the extra margin to account for unexpected signal increases or temperature effects. The ADC must have a noise floor low enough that the digitized signal has sufficient resolution accuracy for the later Coefficient Estimation computation.
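Equation (7.1) can be illustrated with a short calculation. The sketch below assumes levels expressed in dB relative to ADC full scale; the helper `g3_correction_db`, which turns a measured RMS level into a gain step for g₃, is a hypothetical addition and the numbers are illustrative only.

```python
def adc_rms_target_db(full_scale_db, mpar_sig_db, margin_adc_db):
    """Equation (7.1): desired RMS level at the ADC input, in dB."""
    return full_scale_db - mpar_sig_db - margin_adc_db


def g3_correction_db(measured_rms_db, target_rms_db):
    """Hypothetical helper: gain change for g3 that moves the measured
    RMS level onto the target level."""
    return target_rms_db - measured_rms_db


# Illustrative numbers: full scale at 0 dB, 10 dB peak to average
# ratio, 3 dB margin, and a measured level of -16.5 dB.
target = adc_rms_target_db(0.0, 10.0, 3.0)      # -13.0 dB
step = g3_correction_db(-16.5, target)          # +3.5 dB
```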

Digital Feedback Processor

FIG. 14 illustrates an exemplary digital feedback processor 800 according to embodiments of the invention. The purpose of this block is to compute the coefficients of the inverse filter of the Transmit & PA block so that the difference between the digitized baseband version of the PA output, z[n], and the input to the DDPD Engine, x[n], is minimized. The input to FIG. 14 is from the ADC in the RF feedback block 700. As shown in FIG. 14, this block first implements an optional Feedback Gain Error Corrector 810 to correct for the downconverter gain error (can be done in AGC algorithm instead).

Next, the signal passes through the IF To Baseband (IFTBB) Converter 820 which converts the digital IF signal to a baseband IQ signal. This signal is passed through a Feedback compensation filter to equalize the RF downconverter filtering imperfections. The signal is then upsampled to a rate suitable for correlation. The DDPD/FB Correlator 850 captures both transmit data and the upsampled feedback data and performs signal correlation to align the baseband feedback with the baseband DDPD Engine output signal. This upsampled feedback signal is then downsampled with the phase selected by the correlation processor.

The signals captured in the capture RAMs 840 and 320 are used for DDPD Coefficient computation and can also be used for AGC. Alternatively, instead of using the capture RAMs 840 and 320, real-time level estimators that estimate the level of the signals may be used for AGC.

It is desirable to maintain the gain through the DDPD engine constant and preferably at unity. The Feedback Gain Scaler 880 is used to adjust the signal level of the feedback to the same, or some other fixed level relative to the transmit signal. The adjustment comes from the AGC Processor which uses power estimates (or similar estimates) of the signal to determine the scale factor. Finally, to optimize the DDPD coefficient solution, the feedback samples are delayed in Sample Shift block 890 to place the DDPD filter coefficients at the optimum position.

Feedback Gain Error Corrector.

The gain of the RF Feedback downconverter may change over temperature. This change needs to be corrected to assure proper AGC computation. The Feedback Gain Error Corrector 810 performs the gain error correction, g₄, that is provided by the TX/FB AGC Processor.

FIG. 15 illustrates that g₄ is applied at multiplier 836 to maintain constant Feedback Gain according to embodiments of the invention. It is noted that the feedback gain error corrector 810 is optional, and can be applied at any later stage of processing or even in software as long as the feedback gain error is accounted for in AGC computations.

IF to Baseband Converter.

The objective of the IF To Baseband Converter (IFTBB) 820 is to convert the ADC signal into a resampled baseband signal that is aligned with the transmit signal. In particular, block 820 performs mixing to baseband, filtering, re-sampling, and introduces delay so that the feedback signal closely matches the timing of the transmit signal.

FIG. 16 illustrates an exemplary IFTBB processor 820 that converts the IF signal to a resampled baseband signal according to embodiments of the invention. As shown inside the dotted line of FIG. 16, this processor may be implemented in hardware that has four main blocks: the NCO (numerically controlled oscillator) 821, baseband filter 822, resampler 823, and signal delay 824. The design also includes the Delay Buffers that are used to align the baseband feedback signal with the transmit signal.

NCO.

The NCO 821 mixes the ADC signal y_(ADC)[n] with digital LO signal e^(j2πf_(LO)n/R_(ADC))=cos(2πf_(LO)n/R_(ADC))+j sin(2πf_(LO)n/R_(ADC)) to produce the in-phase component I_(FB1)[n] and quadrature component Q_(FB1)[n]. Thus,

I _(FB1) [n]=y _(ADC) [n]cos(2πf _(LO) n/R _(ADC))  (8.1)

Q _(FB1) [n]=y _(ADC) [n] sin(2πf _(LO) n/R _(ADC))  (8.2)

If the NCO frequency f_(LO) is −R_(ADC)/4, then cos(2πf_(LO)n/R_(ADC)) and sin(2πf_(LO)n/R_(ADC)) take only the values −1, 0 and 1, and the multipliers in equations (8.1) and (8.2) are simplified.
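The simplification for f_(LO)=−R_(ADC)/4 can be sketched as follows. At this frequency the cosine repeats 1, 0, −1, 0 and the sine repeats 0, −1, 0, 1, so equations (8.1) and (8.2) reduce to selecting, negating or zeroing each ADC sample. The function name is hypothetical and the code is an illustration, not the hardware of FIG. 16.

```python
def quarter_rate_mix(y_adc):
    """Mix a real ADC stream with an LO at -R_ADC/4 without multipliers.

    Returns (I_FB1, Q_FB1) per equations (8.1) and (8.2).
    """
    i_fb1, q_fb1 = [], []
    for n, y in enumerate(y_adc):
        phase = n % 4
        i_fb1.append((y, 0, -y, 0)[phase])   # y * cos(-pi*n/2)
        q_fb1.append((0, -y, 0, y)[phase])   # y * sin(-pi*n/2)
    return i_fb1, q_fb1


i_out, q_out = quarter_rate_mix([1, 2, 3, 4])
# i_out == [1, 0, -3, 0], q_out == [0, -2, 0, 4]
```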

Baseband Filters.

The signals, I_(FB1)[n] and Q_(FB1)[n], are then low pass filtered in baseband filters 822 to reject the signal image:

I _(FB2) [n]=LPF(I _(FB1) [n])  (8.3)

Q _(FB2) [n]=LPF(Q _(FB1) [n])  (8.4)

This filter should be designed to reject the image down to below the ADC noise floor.

Resampler.

The signals, I_(FB2)[n] and Q_(FB2)[n], have a sampling rate of R_(ADC). In order to do the weight computation, the feedback data sample rate must be the same as the transmit data sample rate, R_(s). Let M_(res)/N_(res) be the ratio of R_(ADC)/R_(s) where M_(res) and N_(res) are integers. I_(FB2)[n] and Q_(FB2)[n] can be upsampled by a factor of N_(res) and later downsampled by a factor of M_(res). The upsampling in resampler 823 can be done by any method known to a skilled practitioner in the art. The upsampled signals are I_(FB3)[n] and Q_(FB3)[n]. Finally, downsampling is straightforward (take one of every M_(res) samples) resulting in the signals I_(FB4)[n] and Q_(FB4)[n] having the sampling rate of R_(s).
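The rate change can be sketched in a few lines. The example rates (240 MSPS and 180 MSPS) are hypothetical, and linear interpolation is used only as a simple stand-in for whatever upsampling method an implementation would choose; the `phase` argument of `downsample` is likewise an illustrative parameter.

```python
from fractions import Fraction


def resample_ratio(r_adc, r_s):
    """Return (M_res, N_res) with M_res/N_res = R_ADC/R_s in lowest terms."""
    ratio = Fraction(r_adc, r_s)
    return ratio.numerator, ratio.denominator


def upsample_linear(x, n_res):
    """Upsample by n_res using linear interpolation (stand-in method)."""
    out = []
    for a, b in zip(x, x[1:]):
        for k in range(n_res):
            out.append(a + (b - a) * k / n_res)
    out.append(x[-1])
    return out


def downsample(x, m_res, phase=0):
    """Keep one of every m_res samples, starting at the given phase."""
    return x[phase::m_res]


m_res, n_res = resample_ratio(240_000_000, 180_000_000)   # (4, 3)
fb3 = upsample_linear([0, 3, 6, 9, 12], n_res)            # rate M_res * R_s
fb4 = downsample(fb3, m_res)                              # rate R_s
```

After upsampling by N_(res) the intermediate rate equals M_(res)R_(s), which is why choosing the downsampler phase gives a timing resolution of 1/(M_(res)R_(s)).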

Delay Buffers.

Two Delay Buffers 824 are used to facilitate signal alignment between the IFTBB signal, z_(FB3)[n]=I_(FB3)[n]+j Q_(FB3)[n], and the Transmit Limiter output, y_(TL)[n]=I_(TL)[n]+j Q_(TL)[n]. As shown in FIG. 16, Delay Buffer #1 is a fine delay with an M_(res) tap delay register located before the down sampling. This delay buffer facilitates fine time shifting by choosing the phase of the downsampler. It has resolution 1/(M_(res)R_(s)). Delay Buffer #2 is a coarse delay with an LL_(max) tap delay line that facilitates coarse time sample shifting with resolution 1/R_(s).

Note that the ADC data is delayed with respect to the DDPD data since it must be transmitted and then fed back. In order to simplify the explanation of the invention, a fixed delay which is larger than necessary is applied to the DDPD data. This is why a delay of the ADC data is required. In practice, the 1/R_(s) variable delay can be applied only to the DDPD data.

Feedback Distortion Compensator (Equalizer).

The Feedback Equalizer 830 is used to compensate amplitude and other distortion of the analog feedback components. This section covers both the filter used and a process of determining the filter. Block 830 is similar to the equalizer on the transmit side to compensate for known tilts, phase changes, and the like.

Feedback Signal Capture RAM.

The output of the IFTBB block is a baseband version of the PA signal. The signal is delayed by N_(FBdelay) samples, and L samples of this complex signal are captured in the Feedback Signal Capture RAM 840 to be used in the TX/FB AGC Processor and DDPD Coefficient Estimator. The AGC processor can use a real-time level estimator instead of the data capture. If the later discussed A and B matrix computation takes place at the data rate speed, no data capture is required, and the data goes straight into the weight computation engine.

DDPD/FB Correlator.

In the DDPD system, the transmitted data and the feedback data are supposed to look the same. However, in reality the feedback data experiences time delay as it passes through a digital filter, the PA, and back into a capture RAM. Because of this delay, the transmit and feedback data must be time-aligned (correlated).

To time-align the transmit and feedback data, a process referred to as correlation is performed in which particular delays are applied to the I and Q samples of the transmit and feedback paths, and an evaluation is performed to determine which delays cause the signals to line up. To determine if the signals line up for a given delay, a complex multiplication of the conjugate of the transmit signal samples X(n)* with the feedback signal samples Y(n) is performed, and the results are summed up. The correlation will be greatest at the delay that produces the largest result. To identify this delay, the transmit or feedback signal can be delayed by various amounts or swept over the full range of possible delays, and the delay at which the maximum correlation was found can be identified. This process is typically done only once at startup, although in other embodiments it could be done more than once, during operation.

FIG. 17 illustrates an exemplary DDPD/FB Correlator 850 according to embodiments of the invention. The objective of the DDPD/FB Correlator 850 is to align the captured transmit signal and the captured IFTBB signal to within 0.5/(M_(res)R_(s)). The captures that are aligned are the coefficient estimator RAMs, not the correlation RAMs. The correlation RAMs are used to find the necessary time alignment settings. FIG. 17 shows a possible implementation of this correlator representing a portion of a correlation processor. The delay time steps over a range τ=0, 1, 2, 3, 4, . . . , N_(max) where [0 . . . N_(max)/(M_(res)R_(s))] is the maximum range of expected time difference. The processor decomposes the optimum delay, n_(opt), as:

M=modulo(n _(opt) ,M _(res))  (8.10)

LL=floor(n _(opt) /M _(res))  (8.11)

where floor(x) rounds the elements of x to the nearest integers towards minus infinity. For example, if M_(res)=4 and n_(opt)=18, we would have M=2 and LL=4. The actual delay in time would be 18/(4R_(s)). The shifts M and LL are then applied at the Feedback Fine and Coarse Delay Buffers, respectively.
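Equations (8.10) and (8.11) amount to an integer division with remainder, as the following sketch shows. The function name is hypothetical; the fine part M selects the downsampler phase in Delay Buffer #1 and the coarse part LL is the whole-sample shift in Delay Buffer #2.

```python
def decompose_delay(n_opt, m_res):
    """Split the optimum delay into fine (M) and coarse (LL) parts."""
    fine_m = n_opt % m_res        # equation (8.10)
    coarse_ll = n_opt // m_res    # equation (8.11), floor division
    return fine_m, coarse_ll


fine_m, coarse_ll = decompose_delay(18, 4)   # (2, 4)
```

The two parts recombine as n_(opt) = M_(res)·LL + M, that is 4·4 + 2 = 18 for the example in the text.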

At any delay, n_(delta), the correlation value is computed by performing the complex multiplication of the DDPD and ADC inputs to the correlator, y_(TL)[n] and z_(FB3)[n] respectively:

$\begin{matrix}\begin{aligned}
C\left[n_{delta}\right] &= \sum\limits_{n=0}^{N_{corr}-1} y_{TL}[n] \cdot z_{FB3}^{\prime}\left(M_{res}\left(n+n_{delta}\right)\right) \\
&= \sum\limits_{n=0}^{N_{corr}-1} \left[I_{TL}[n] + j\,Q_{TL}[n]\right] \cdot \left[I_{FB3}\left(M_{res}\left(n+n_{delta}\right)\right) - j\,Q_{FB3}\left(M_{res}\left(n+n_{delta}\right)\right)\right] \\
&= \sum\limits_{n=0}^{N_{corr}-1} \Big\{ I_{TL}[n]\,I_{FB3}\left(M_{res}\left(n+n_{delta}\right)\right) + Q_{TL}[n]\,Q_{FB3}\left(M_{res}\left(n+n_{delta}\right)\right) \\
&\qquad + j\left[ I_{FB3}\left(M_{res}\left(n+n_{delta}\right)\right) Q_{TL}[n] - Q_{FB3}\left(M_{res}\left(n+n_{delta}\right)\right) I_{TL}[n] \right] \Big\}
\end{aligned} & (8.12)\end{matrix}$

The n_(delta) which maximizes |C[n_(delta)]| is the optimum delay, n_(opt). This process aligns the samples z[n]=I_(FB3)[n]+j Q_(FB3)[n] with DDPD signal y[n]=I_(TL)[n]+j Q_(TL)[n] to within 0.5/(M_(res)R_(s)). The actual delays used in an implementation may vary from these to account for latencies in the design.
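The delay sweep can be sketched in software as follows. This is an illustration rather than the hardware of FIG. 17. As an assumption of the sketch, the upsampled feedback is indexed at M_(res)·n + n_(delta) so that the sweep steps in units of 1/(M_(res)R_(s)), which is the resolution used in the worked example for n_(opt); the function names are hypothetical.

```python
def correlate(y_tl, z_fb3, m_res, n_delta, n_corr):
    """Correlation value for one trial delay: sum of y_TL[n] times the
    conjugate of the upsampled feedback sample at m_res*n + n_delta."""
    acc = 0j
    for n in range(n_corr):
        acc += y_tl[n] * z_fb3[m_res * n + n_delta].conjugate()
    return acc


def find_optimum_delay(y_tl, z_fb3, m_res, n_max, n_corr):
    """Sweep n_delta over 0..n_max and return the delay with largest |C|."""
    return max(
        range(n_max + 1),
        key=lambda d: abs(correlate(y_tl, z_fb3, m_res, d, n_corr)),
    )
```

The feedback list must hold at least M_(res)·(N_(corr)−1) + N_(max) + 1 samples for the sweep to stay in range. Because only the magnitude of C is compared, a constant gain or phase rotation in the feedback path does not move the peak.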

Note that the complex multiplications are shown at 852, and the summations are shown as I&D N_(corr) blocks 854. Because this process may only need to be performed once, the multipliers in FIG. 17 may be time-shared and reused for other functions in a quasi-hardware implementation. Note that correlation of the transmit and feedback data is also described in U.S. patent application Ser. No. 11/150,445 entitled “Digital Pre-Distortion Technique Using Nonlinear Filters.”

TX/FB AGC Processor.

In order for the DDPD system to work effectively, the gain of the system from digital input to TX output must be monitored and adjusted to keep the gain constant. This can be done in many ways but for the best mode, the gain adjustments should come after the DDPD engine and must be small changes (usually less than 0.05 dB steps) that occur periodically. In one embodiment, a TX/FB AGC Processor 860 monitors the gain and keeps it constant. Optimally, at least ten DDPD Coefficient updates should occur between gain changes in order to keep a good linearization solution as the gain is changed.

The objective of the TX/FB AGC Processor 860 is to maintain the output of the PA at the correct level. In practice, the gain of the transmit RF upconverter, downconverter, and the PA will be changing over temperature. When this occurs, the power level at the output of the PA may no longer be accurate to within an acceptable level. Thus, AGC is required to maintain the correct PA output level.

Feedback Gain Scaler.

The purpose of the feedback gain scaler block 880 is to adjust the signal amplitude at the Feedback path to be the same (or a known ratio) as the transmit path so that the DDPD coefficients will produce a known gain. This scaler will be updated based on the average power levels of the captured data. This scaling can also be done at the end of the computation and applied to the weights. Scaling the weights requires more complexity, however. This scaler can also be implemented before the data capture.
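A minimal sketch of such a scaler, assuming the scale factor is derived from the RMS levels of the two captures, is shown below. The function names and the `target_ratio` parameter are hypothetical.

```python
import math


def rms(samples):
    """RMS level of a list of complex samples."""
    return math.sqrt(sum(abs(s) ** 2 for s in samples) / len(samples))


def feedback_scale(tx_capture, fb_capture, target_ratio=1.0):
    """Scale factor that brings the feedback RMS to target_ratio times
    the transmit RMS."""
    return target_ratio * rms(tx_capture) / rms(fb_capture)


def apply_scale(fb_capture, scale):
    """Apply the scale factor to every feedback sample."""
    return [scale * s for s in fb_capture]


tx = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
fb = [0.5 * s for s in tx]                 # feedback arrives 6 dB low
scale = feedback_scale(tx, fb)             # 2.0
fb_scaled = apply_scale(fb, scale)
```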

Sample Shift.

Shifting the time aligned ADC data with respect to the DDPD data in Sample Shift block 890 has the effect of shifting the location of the dominant weight. The distribution of the coefficient set W impacts the accuracy of the DDPD engine. In order to optimize the predistortion solution, it is necessary to find the best placement of the coefficient distribution by experimentally changing the delay, d, from 0 to q₁−1, where q₁ is the length of the fundamental filter. By checking the performance of the system for each delay, the optimal setting is obtained. For ease of notation, the output of the Sample Shift is still expressed as z[n]=delay{z[n]}. This Sample Shift can also be implemented before the data capture.

DDPD Coefficient Estimator

FIG. 18 illustrates an exemplary DDPD Coefficient Estimator 900 according to embodiments of the invention. The objective of the DDPD Coefficient Estimator block 900 is to compute a set of DDPD Engine coefficients, W, used to predistort the transmit signal.

W=[w₁₁, w₁₂, . . . , w_(1,q1); w₂₁, w₂₂, . . . , w_(2,q2); . . . ; w_(N,1), w_(N,2), . . . , w_(N,qN)].  (9.1)

The baseband version, z[n], of the output of the power amplifier (PA)can be expressed as:

z[n]=PA(y[n])=PA(DDPD(x[n]))  (9.2)

where y[n] and z[n] are the baseband version of the input and the output of the power amplifier, respectively, x[n] is the input to the DDPD Engine, and PA( ) is the unity gain baseband equivalent of the PA. If the DDPD acts like the inverse of PA, we expect to have z[n]≈x[n]. Thus from Equation (9.2), we have

PA(y[n])=PA(DDPD(z[n]))  (9.3)

y[n]=DDPD(z[n])  (9.4)

We have y[n] and z[n] in the data capture RAMs minus their post processing, so writing this equation out,

$\begin{matrix}\begin{aligned}
y[n] ={}& w_{11} z[n] + w_{12} z[n-1] + \ldots + w_{1,d}\, z[n-d+1] \\
&+ z[n]\begin{pmatrix}
w_{1,21} b^{k_{12}-1}[n-\lambda_{12}] + w_{1,22} b^{k_{12}-1}[n-\lambda_{12}-1] + \ldots + w_{1,2q_{12}} b^{k_{12}-1}[n-\lambda_{12}-q_{12}+1] + \\
w_{1,31} b^{k_{13}-1}[n-\lambda_{13}] + w_{1,32} b^{k_{13}-1}[n-\lambda_{13}-1] + \ldots + w_{1,3q_{13}} b^{k_{13}-1}[n-\lambda_{13}-q_{13}+1] + \\
\ldots \\
w_{1,N_{1}1} b^{k_{1N_{1}}-1}[n-\lambda_{1N_{1}}] + w_{1,N_{1}2} b^{k_{1N_{1}}-1}[n-\lambda_{1N_{1}}-1] + \ldots + w_{1,N_{1}q_{1N_{1}}} b^{k_{1N_{1}}-1}[n-\lambda_{1N_{1}}-q_{1N_{1}}+1]
\end{pmatrix} + \ldots \\
&+ z[n-r]\begin{pmatrix}
w_{r,21} b^{k_{r2}-1}[n-\lambda_{r2}] + w_{r,22} b^{k_{r2}-1}[n-\lambda_{r2}-1] + \ldots + w_{r,2q_{r2}} b^{k_{r2}-1}[n-\lambda_{r2}-q_{r2}+1] + \\
w_{r,31} b^{k_{r3}-1}[n-\lambda_{r3}] + w_{r,32} b^{k_{r3}-1}[n-\lambda_{r3}-1] + \ldots + w_{r,3q_{r3}} b^{k_{r3}-1}[n-\lambda_{r3}-q_{r3}+1] + \\
\ldots \\
w_{r,N_{r}1} b^{k_{rN_{r}}-1}[n-\lambda_{rN_{r}}] + w_{r,N_{r}2} b^{k_{rN_{r}}-1}[n-\lambda_{rN_{r}}-1] + \ldots + w_{r,N_{r}q_{rN_{r}}} b^{k_{rN_{r}}-1}[n-\lambda_{rN_{r}}-q_{rN_{r}}+1]
\end{pmatrix} + \ldots \\
&+ z[n-d]\begin{pmatrix}
w_{d,21} b^{k_{d2}-1}[n-\lambda_{d2}] + w_{d,22} b^{k_{d2}-1}[n-\lambda_{d2}-1] + \ldots + w_{d,2q_{d2}} b^{k_{d2}-1}[n-\lambda_{d2}-q_{d2}+1] + \\
w_{d,31} b^{k_{d3}-1}[n-\lambda_{d3}] + w_{d,32} b^{k_{d3}-1}[n-\lambda_{d3}-1] + \ldots + w_{d,3q_{d3}} b^{k_{d3}-1}[n-\lambda_{d3}-q_{d3}+1] + \\
\ldots \\
w_{d,N_{d}1} b^{k_{dN_{d}}-1}[n-\lambda_{dN_{d}}] + w_{d,N_{d}2} b^{k_{dN_{d}}-1}[n-\lambda_{dN_{d}}-1] + \ldots + w_{d,N_{d}q_{dN_{d}}} b^{k_{dN_{d}}-1}[n-\lambda_{dN_{d}}-q_{dN_{d}}+1]
\end{pmatrix}
\end{aligned} & (9.5)\end{matrix}$

where b[n]=|z[n]| and the operator |x| means √((real(x))²+(imag(x))²). In order to solve equation (9.5) to produce the DDPD coefficient set, W, the DDPD Coefficient Estimator processor uses the steps shown inside the dotted box in FIG. 18.

Note that in FIG. 18, the DDPD Signal Capture RAM samples and the Feedback Signal Capture RAM samples are shown after they have been time-aligned as described above. The Feedback Gain Scaler 880 and Sample Shift 890 are described above.

Compute Covariance Matrix A and B.

The purpose of the Compute Covariance Matrix block 910 is to compute covariance matrices A and B, which are needed to compute the weights W needed by the DDPD engine in its computation of a predistorted transmit signal.

From Equation (9.5), we have:

$\begin{matrix}{W = \begin{bmatrix}v_{1} \\v_{2} \\\ldots \\v_{d} \\v_{d + 1} \\v_{d + 2} \\\ldots \\v_{d + q_{12}} \\\ldots \\v_{d + q_{12} + q_{13} + \ldots + q_{1N_{1}}} \\\ldots \\v_{d + q_{12} + q_{13} + \ldots + q_{{1N_{1}} + \ldots + q_{{rN}_{r}}}} \\\ldots \\v_{d + q_{12} + q_{13} + \ldots + q_{1N_{1}} + q_{22} + \ldots + q_{{rN}_{r}} + q_{{({r + 1})}2} + \ldots + q_{{dN}_{d}}}\end{bmatrix}} \\{= {\begin{bmatrix}w_{11} \\w_{12} \\\ldots \\w_{1d} \\w_{1,21} \\w_{1,22} \\\ldots \\w_{1,{2q_{12}}} \\\ldots \\w_{1,{Nq}_{1N_{1}}} \\\ldots \\w_{r,{Nq}_{{rN}_{r}}} \\\ldots \\w_{d,{Nq}_{{dN}_{d}}}\end{bmatrix}^{T}.}}\end{matrix}$

There are M elements in this vector, M=q₁+q_(1N)+q_(2N) . . . +q_(dN).

$\begin{matrix}{K_{L \times M} = \begin{bmatrix}{c_{1}\lbrack n\rbrack} \\{c_{2}\lbrack n\rbrack} \\\ldots \\{c_{d}\lbrack n\rbrack} \\{c_{d + 1}\lbrack n\rbrack} \\{c_{d + 2}\lbrack n\rbrack} \\\ldots \\{c_{d + q_{12}}\lbrack n\rbrack} \\\ldots \\{c_{d + q_{12} + q_{13} + \ldots + q_{1N_{1}}}\lbrack n\rbrack} \\\ldots \\{c_{d + q_{12} + q_{13} + \ldots + q_{1N_{1}} + \ldots + q_{{rN}_{r}}}\lbrack n\rbrack} \\\ldots \\{c_{d + q_{12} + q_{13} + \ldots + q_{1N_{1}} + q_{22} + \ldots + q_{{rN}_{r}} + q_{{({r + 1})}2} + \ldots + q_{{dN}_{d}}}\lbrack n\rbrack}\end{bmatrix}} \\{= \begin{bmatrix}{z\lbrack n\rbrack} \\{z\left\lbrack {n - 1} \right\rbrack} \\\ldots \\{z\left\lbrack {n - d + 1} \right\rbrack} \\{{z\lbrack n\rbrack}{b^{k_{12} - 1}\left\lbrack {n - \lambda_{1}} \right\rbrack}} \\{{z\lbrack n\rbrack}{b^{k_{12} - 1}\left\lbrack {n - \lambda_{1} - 1} \right\rbrack}} \\\ldots \\{{z\lbrack n\rbrack}{b^{k_{12} - 1}\left\lbrack {n - \lambda_{1} - q_{12} + 1} \right\rbrack}} \\\ldots \\{{z\lbrack n\rbrack}{b^{k_{{1N_{1}} - 1}}\left\lbrack {n - \lambda_{1} - q_{1N_{1}} + 1} \right\rbrack}} \\\ldots \\{{z\lbrack n\rbrack}{b^{k_{{rN}_{r}} - 1}\left\lbrack {n - \lambda_{r} - q_{{rN}_{r}} + 1} \right\rbrack}} \\\ldots \\{{z\lbrack n\rbrack}{b^{k_{{dN}_{d}} - 1}\left\lbrack {n - \lambda_{d} - q_{{dN}_{d}} + 1} \right\rbrack}}\end{bmatrix}^{T}}\end{matrix}$

More general is:

$\begin{matrix}\begin{matrix}{K_{L \times M} = \begin{bmatrix}{c_{1}\lbrack n\rbrack} \\{c_{2}\lbrack n\rbrack} \\\ldots \\{c_{d}\lbrack n\rbrack} \\{c_{d + 1}\lbrack n\rbrack} \\{c_{d + 2}\lbrack n\rbrack} \\\ldots \\{c_{d + q_{12}}\lbrack n\rbrack} \\\ldots \\{c_{d + q_{12} + q_{13} + \ldots + q_{1N_{1}}}\lbrack n\rbrack} \\\ldots \\{c_{d + q_{12} + q_{13} + \ldots + q_{1N_{1}} + \ldots + q_{{rN}_{r}}}\lbrack n\rbrack} \\\ldots \\{c_{d + q_{12} + q_{13} + \ldots + q_{1N_{1}} + q_{22} + \ldots + q_{{rN}_{r}} + q_{{({r + 1})}2} + \ldots + q_{{dN}_{d}}}\lbrack n\rbrack}\end{bmatrix}} \\{= \begin{bmatrix}{z\lbrack n\rbrack} \\{z\left\lbrack {n - 1} \right\rbrack} \\\ldots \\{z\left\lbrack {n - d + 1} \right\rbrack} \\{{z\lbrack n\rbrack}{b^{k_{12} - 1}\left\lbrack {n - \lambda_{12}} \right\rbrack}} \\{{z\lbrack n\rbrack}{b^{k_{12} - 1}\left\lbrack {n - \lambda_{12} - 1} \right\rbrack}} \\\ldots \\{{z\lbrack n\rbrack}{b^{k_{12} - 1}\left\lbrack {n - \lambda_{12} - q_{12} + 1} \right\rbrack}} \\\ldots \\{{z\lbrack n\rbrack}{b^{k_{{1N_{1}} - 1}}\left\lbrack {n - \lambda_{1N_{1}} - q_{1N_{1}} + 1} \right\rbrack}} \\\ldots \\{{z\lbrack n\rbrack}{b^{k_{{rN}_{r}} - 1}\left\lbrack {n - \lambda_{{rN}_{r}} - q_{{rN}_{r}} + 1} \right\rbrack}} \\\ldots \\{{z\lbrack n\rbrack}{b^{k_{{dN}_{d}} - 1}\left\lbrack {n - \lambda_{{dN}_{d}} - q_{{dN}_{d}} + 1} \right\rbrack}}\end{bmatrix}^{T}}\end{matrix} & (9.8)\end{matrix}$

where n=[1 2 . . . L]^(T), and L is the number of samples that is used to compute the DDPD linearization coefficients.

The DDPD Engine weight solution is based on the equation KW=Y, where K is a function of the input (the FB capture), Y is the output (the transmit capture), and W are the weights needed to modify K so that it equals Y. Because K and Y are captured and known, the weights W should be capable of being determined. Therefore, in matrix form,

K _(L×M) W _(M×1) =Y _(L×1)  (9.12)

where Y_(L×1)=y[n], and K_(L×M) is defined as:

$\begin{matrix}\begin{matrix}{K_{L \times M} = \begin{bmatrix}{k_{1}\lbrack n\rbrack} \\{k_{2}\lbrack n\rbrack} \\\ldots \\{k_{q_{2}}\lbrack n\rbrack} \\{k_{q_{2} + 1}\lbrack n\rbrack} \\{k_{q_{2} + 2}\lbrack n\rbrack} \\\ldots \\{k_{q_{2} + q_{3}}\lbrack n\rbrack} \\\ldots \\{k_{q_{2} + q_{3} + \ldots + q_{N - 1} + 1}\lbrack n\rbrack} \\{k_{q_{2} + q_{3} + \ldots + q_{N - 1} + 2}\lbrack n\rbrack} \\\ldots \\{k_{q_{2} + q_{3} + \ldots + q_{N - 1} + q_{N}}\lbrack n\rbrack} \\{k_{q_{2} + q_{3} + \ldots + q_{N - 1} + q_{N} + 1}\lbrack n\rbrack} \\{k_{q_{2} + q_{3} + \ldots + q_{N - 1} + q_{N} + 2}\lbrack n\rbrack} \\{k_{q_{2} + q_{3} + \ldots + q_{n - 1} + q_{N} + 3}\lbrack n\rbrack} \\{k_{q_{2} + q_{3} + \ldots + q_{N - 1} + q_{N} + 4}\lbrack n\rbrack} \\\ldots \\{k_{q_{2} + q_{3} + \ldots + q_{N - 1} + q_{N} + q_{1}}\lbrack n\rbrack}\end{bmatrix}^{T}} \\{{= \begin{bmatrix}{{\lambda \lbrack n\rbrack}{c_{1}\lbrack n\rbrack}} \\{{\lambda \lbrack n\rbrack}{c_{2}\lbrack n\rbrack}} \\\ldots \\{{\lambda \lbrack n\rbrack}{c_{q_{2}}\lbrack n\rbrack}} \\{{\lambda \lbrack n\rbrack}{c_{q_{2} + 1}\lbrack n\rbrack}} \\{{\lambda \lbrack n\rbrack}{c_{q_{2} + 2}\lbrack n\rbrack}} \\\ldots \\{{\lambda \lbrack n\rbrack}{c_{q_{2} + q_{3}}\lbrack n\rbrack}} \\\ldots \\{{\lambda \lbrack n\rbrack}{c_{q_{2} + q_{3} + \ldots + q_{N - 1} + 1}\lbrack n\rbrack}} \\{{\lambda \lbrack n\rbrack}{c_{q_{2} + q_{3} + \ldots + q_{N - 1} + 2}\lbrack n\rbrack}} \\\ldots \\{{\lambda \lbrack n\rbrack}{c_{q_{2} + q_{3} + \ldots + q_{N - 1} + q_{N}}\lbrack n\rbrack}} \\{\lambda \lbrack n\rbrack} \\{d_{1}\lbrack n\rbrack} \\{d_{2}\lbrack n\rbrack} \\{d_{3}\lbrack n\rbrack} \\\ldots \\{d_{q_{1} - 1}\lbrack n\rbrack}\end{bmatrix}^{T}};}\end{matrix} & (9.13) \\{n = \begin{bmatrix}1 & 2 & \ldots & L\end{bmatrix}^{T}} & \;\end{matrix}$

However, the equation KW=Y actually represents an overdetermined system of equations, in which there are more equations than unknowns. For example, if K is an 8000×15 matrix, Y is an 8000×1 matrix, and W is a 15×1 matrix, there are 8000 equations but only 15 unknowns. In general, no solution solves all of the equations exactly; the best that can be hoped for is a solution that comes closest to solving all of the equations with the least error.

One well-known technique for dealing with an overdetermined system of equations such as KW=Y is to find the W that minimizes the mean squared error. In other words, find W so that when the difference between the left side and the right side of each equation is computed, squared in magnitude, and summed over all equations, the smallest possible number is obtained.

In the present application, KW=Y can be transformed into the form AW=B by setting A=(K*)^(T)K (A equals K conjugated and transposed times K) and B=(K*)^(T)Y, where A and B are covariance matrices. Continuing the example above, after this transformation A is a 15×15 matrix and B is a 15×1 matrix, so there are now 15 equations and 15 unknowns, and AW=B can be solved.

Thus, after multiplying both sides of equation (9.12) by K_(L×M) ^(H), equation (9.12) can be expressed in terms of the “normal” equation,

A _(M×M) W _(M×1) =B _(M×1)  (9.14)

where

A _(M×M) =K _(L×M) ^(H) K _(L×M)  (9.15)

B _(M×1) =K _(L×M) ^(H) Y _(L×1)  (9.16)

and ( )^(H) is the Hermitian (conjugate transpose) operator.
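As an illustration of equations (9.14) through (9.16), the following is a minimal sketch in plain Python that forms A and B from a small overdetermined system and solves the resulting square system. The function name `normal_matrices`, the toy 4×2 data, and the use of Cramer's rule for the 2×2 solve are assumptions made for this example only; they are not taken from the disclosed hardware.

```python
def normal_matrices(K, Y):
    """Return A = K^H K (M x M) and B = K^H Y (length M) for an L x M matrix K."""
    M = len(K[0])
    A = [[sum(row[i].conjugate() * row[j] for row in K) for j in range(M)]
         for i in range(M)]
    B = [sum(row[i].conjugate() * y for row, y in zip(K, Y)) for i in range(M)]
    return A, B


# Toy capture (made-up numbers): L = 4 equations, M = 2 unknown weights.
K = [[1 + 1j, 2 + 0j], [0.5j, 1 - 1j], [2 + 0j, -1j], [1 - 2j, 3 + 0j]]
W_true = [0.5 - 0.25j, 1j]
Y = [sum(k * w for k, w in zip(row, W_true)) for row in K]

A, B = normal_matrices(K, Y)   # 2 x 2 and 2 x 1: as many equations as unknowns

# Cramer's rule is enough for a 2 x 2 system; it stands in for the solver here.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
W = [(B[0] * A[1][1] - A[0][1] * B[1]) / det,
     (A[0][0] * B[1] - A[1][0] * B[0]) / det]
```

Because Y in this toy case was generated exactly from `W_true`, the solved W reproduces it; with noisy captures the same computation would instead return the least-squares fit.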

Diagonal Noise Adding.

Upon completion of the Compute Covariance Matrix A and B block 910, the A and B matrices have been computed. For example, A may be a 15×15 matrix. However, in some instances, every equation in the system of equations represented by A may be almost the same as every other equation. The more alike the equations are, the more “singular” the matrix. It is preferred that matrix A not be too singular; otherwise, when attempting to solve the system of equations, no good solution may be computed.

To stabilize the process of Gaussian elimination to solve these equations, embodiments of the invention may add a real or complex offset to each of the diagonal elements of the A matrix. This is referred to as diagonal noise adding. By adding noise to the system, there is less chance that the matrix will be singular.

To add stability to the system, a constant value can be added to all of the diagonal elements of the A matrix. This helps to prevent singularities during Gaussian Elimination. Mathematically,

$\begin{matrix}{{A_{M \times M} = {A_{M \times M} + K_{diag\_ noise}}}{where}} & (9.49) \\{K_{diag\_ noise} = \begin{bmatrix}\alpha_{1} & 0 & 0 & \ldots & 0 \\0 & \alpha_{2} & 0 & \ldots & 0 \\0 & 0 & \alpha_{3} & \ldots & 0 \\\ldots & \ldots & \ldots & \ldots & \ldots \\0 & 0 & 0 & \ldots & \alpha_{M}\end{bmatrix}} & (9.50)\end{matrix}$

Typically α_(i) is in the range of 1e⁻⁵*Σ|z|² to 1e⁻⁸*Σ|z|² for a full scale input. The α_(i)'s can be changed as the signal level changes. Each α_(i) corresponds to a delay and certain order term in the A matrix and can be optimized accordingly via experimentation.

Equation 9.50 shows the most general implementation where each element is different, but in other embodiments, the noise can be the same for every diagonal element. The proper noise levels can be determined empirically. These noise elements can be changed as signal levels and characteristics change.
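The effect of equations (9.49) and (9.50) can be sketched as follows. This is a hedged illustration only: the helper names `add_diagonal_noise` and `det2`, the deliberately singular 2×2 matrix, and the value used for Σ|z|² are all assumptions chosen to make the point visible.

```python
def add_diagonal_noise(A, alphas):
    """Return a copy of the M x M matrix A with alphas[i] added to element (i, i)."""
    if len(alphas) != len(A):
        raise ValueError("need one alpha per diagonal element")
    return [[A[i][j] + (alphas[i] if i == j else 0) for j in range(len(A))]
            for i in range(len(A))]


def det2(A):
    """Determinant of a 2 x 2 matrix, used here only to show singularity."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


# Two identical equations make A exactly singular.
A = [[4 + 0j, 4 + 0j], [4 + 0j, 4 + 0j]]
energy = 4.0                       # stand-in for the sum of |z|^2 (assumed value)
alphas = [1e-5 * energy] * 2       # same alpha on every diagonal element
A_loaded = add_diagonal_noise(A, alphas)
```

Here `det2(A)` is zero while `det2(A_loaded)` is small but nonzero, so elimination on the loaded matrix no longer divides by a zero pivot.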

Weight Conditioner: Averaging, Interpolation and Extrapolation.

A major challenge for DDPD linearization performance is in the cases where either the input signal transitions from low power to high power, or the input signal suddenly increases its peak amplitude. The problem is that when the signal goes from low to high, or the peak amplitude increases suddenly, the coefficient for low power (or lower amplitude) is not usable for high power (or higher amplitude). We have to wait for the next update (high power or higher amplitude) to have good CPL, and until the next coefficient is applied, the spectral emission mask may be violated. The A and B Matrix Averaging, Interpolation, Extrapolation, and Linear Extrapolation block 930 provides processing to remedy this problem. The processing done in this weight conditioner block is designed to ensure that a more stabilized solution and weights are generated.

Averaging and Interpolation Processing.

In the DDPD system according to embodiments of the invention, data captures of the transmit and feedback paths are periodically obtained, and covariance matrices A and B and pre-distortion weights W are re-computed for each data capture. This re-computation of the covariance matrices and weights is necessary because the transmit signal power levels change over time, and previous matrices and weights will not necessarily be valid at the next data capture. In other words, over time, a series of re-computed A and B matrices can be represented as follows:

A6 A5 A4 A3 A2 A1
B6 B5 B4 B3 B2 B1

Without the process of averaging, as will be described in further detail below, any previously computed A and B matrices are not considered in computing present A and B matrices and weights.

To explain the concepts of averaging and interpolation, it should first be understood that a power amplifier is usually operated in the linear region of the A-A curve, where the input signal level is low and little pre-distortion is necessary. Accordingly, suppose that input data has only been received in the linear region, so that the computed A and B matrices and pre-distortion weights only represent lower power levels. Now suppose that the situation described above occurs, and the signal level suddenly increases to levels at which pre-distortion becomes more necessary. Without any previous input data at these higher signal levels, the newly computed solution for A and B and the corresponding weights, calculated without the benefit of any previous data at high power levels, may produce a solution that does not predistort effectively and/or results in a very large input to the DAC and causes saturation of the PA. (However, the converse is not generally true: if weights for a high signal level are being used and the signal suddenly decreases, the weights in use will still tend to provide adequate compensation.)

To stabilize the weights, each newly computed set of A and B matrices may utilize previously computed A and B matrices so that if a signal level should rapidly increase, the newly computed A and B matrices will at least be influenced to some extent by previous A and B matrices computed at previous signal levels. If the previous signal levels include some covariance matrices A and B computed based on higher transmit power levels, the newly computed A and B matrices may generate a set of more stable and acceptable, though sub-optimal, pre-distortion weights.

In an exemplary IIR filter embodiment, one or more previously computed A and B matrices and the A and B matrices computed solely on the basis of current signal levels can both be weighted in some predetermined manner and added together. For example, the covariance matrix A_(n) for a current data capture “n” may be set to equal a weighted average 0.1A_(curr(n))+0.9A_(n-1), where A_(curr(n)) is an A matrix computed solely on the basis of the current signal level and A_(n-1) is the previously computed covariance matrix A for the previous data capture “n−1.” A similar computation may also be performed for covariance matrix B_(n). It should be understood that the weights 0.1 and 0.9 are only exemplary, and that other weights could be used. Note also that because each successive set of covariance matrices A_(n) and B_(n) utilizes the previously computed set of covariance matrices A_(n-1) and B_(n-1), the newly computed covariance matrices effectively average infinitely into the past. Accordingly, this averaging can be implemented in an IIR filter as well as in a processor.
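The first-order IIR average 0.1A_(curr(n))+0.9A_(n-1) can be sketched in a few lines. The function name `iir_average` and the 1×1 toy matrices are assumptions for illustration; the same update would apply element by element to full-size A and B matrices.

```python
def iir_average(A_prev, A_curr, w_curr=0.1):
    """Return w_curr*A_curr + (1 - w_curr)*A_prev, element by element."""
    w_prev = 1.0 - w_curr
    return [[w_curr * c + w_prev * p for c, p in zip(row_c, row_p)]
            for row_c, row_p in zip(A_curr, A_prev)]


# Toy 1 x 1 "matrices": one low-power capture followed by two high-power captures.
A_state = [[1.0]]
history = []
for A_curr in ([[1.0]], [[100.0]], [[100.0]]):
    A_state = iir_average(A_state, A_curr)
    history.append(A_state[0][0])
```

In this toy run the averaged value moves only part of the way toward the new level at each capture, which is the smoothing behavior described above.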

In another exemplary IIR embodiment, other previously computed sets of covariance matrices (e.g. A_(n-2) and B_(n-2), A_(n-3) and B_(n-3), etc.) as well as other delayed A and B matrices (e.g. A_(curr(n-2)) and B_(curr(n-2)), A_(curr(n-3)) and B_(curr(n-3)), etc.) may be utilized in the computation of the covariance matrices A_(n) and B_(n) for a current data capture “n”. For example, A_(n) may be set to equal a weighted average 0.1A_(curr(n))+0.05A_(curr(n-1))+0.15A_(n-1)+0.3A_(n-2)+0.4A_(n-3). It should be understood that the weights 0.1, 0.05, 0.15, 0.3 and 0.4 are only exemplary, and that other weights could be used. Other variations of the IIR filter implementations described above may also be utilized according to embodiments of the invention.

In an exemplary FIR filter embodiment, one or more delayed A and B matrices and the current A and B matrices can both be weighted in some predetermined manner and added together. For example, the covariance matrix A_(n) for a current data capture “n” may be set to equal a weighted average 0.1A_(curr(n))+0.25A_(curr(n-1))+0.25A_(curr(n-2))+0.25A_(curr(n-3))+0.25A_(curr(n-4)), where A_(curr(n)) is an A matrix computed solely on the basis of the current signal level, and A_(curr(n-t)) are A matrices computed solely on the basis of signal levels present at times “n−t”, where t=1 to 4. A similar computation may also be performed for covariance matrix B_(n). It should be understood that the weights 0.1 and 0.25 are only exemplary, and that other weights could be used. Note also that because each successive set of covariance matrices A_(n) and B_(n) utilizes only a fixed number of covariance matrices A_(curr(n-t)) computed solely on the basis of signal levels present at times “n−t”, the newly computed covariance matrices effectively average only finitely into the past. Accordingly, this averaging can be implemented in a FIR filter as well as in a processor. Other variations of the FIR filter implementation described above may also be utilized according to embodiments of the invention.
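A FIR-style average over the last few captures might be sketched as below. The class name `FirMatrixAverager` and the 1×1 toy matrices are assumptions for illustration. The exemplary weights from the text are used unchanged; note that they sum to 1.1, so an implementation that needs a true average could normalize them.

```python
from collections import deque


class FirMatrixAverager:
    """Weighted sum of the most recent captures; weights[0] applies to the newest."""

    def __init__(self, weights):
        self.weights = list(weights)
        # Oldest capture falls off the right end once the history is full.
        self.history = deque(maxlen=len(self.weights))

    def update(self, A_curr):
        self.history.appendleft(A_curr)
        rows, cols = len(A_curr), len(A_curr[0])
        return [[sum(w * m[r][c] for w, m in zip(self.weights, self.history))
                 for c in range(cols)] for r in range(rows)]


fir = FirMatrixAverager([0.1, 0.25, 0.25, 0.25, 0.25])
outs = [fir.update([[1.0]])[0][0] for _ in range(5)]    # fill the history
spike = fir.update([[100.0]])[0][0]                     # one high-power capture
tail = [fir.update([[0.0]])[0][0] for _ in range(5)]    # high capture ages out
```

After five further captures the high-power capture has left the window entirely, which shows the finite memory that distinguishes this form from the IIR form.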

Also, to effectively handle the signal transition from low power to high power, and for signals that occasionally increase in peak amplitude, signal combining is done between data from previous high inputs and current low inputs in a process referred to as interpolation. In other words, interpolation, as defined herein, is the process of using previous data captured at a different signal level in order to produce a better solution (e.g. using previous data captured at higher signal levels to compute A and B matrices for low signal levels, or vice versa).

Extrapolation Processing.

In normal operation, the DDPD signal and the feedback signals are captured and the weights, W, are derived based on these data captures. Let A_(H) be the maximum amplitude of a particular feedback data capture. The polynomial coefficient set derived from this capture can effectively cancel the PA non-linearity and distortion for signals that have amplitudes equal to or lower than A_(H). However, the polynomial coefficients are not well defined for signal amplitudes that exceed A_(H). In practice, signal amplitude peaks may occasionally exceed A_(H) and the DDPD coefficients may not correct as well for these high peak samples. This can result in very poor CPL for short durations.

Extrapolation is another methodology for improving the generation of pre-distortion weights when the signal level suddenly becomes large with no previous track record of covariance matrices A and B computed at high power levels and therefore no previous pre-distortion at those levels. Without any previous history, the solution can be poor and possibly create larger outputs causing saturation of the PA and glitches on the output, which can create spurs and noise that exceed specified levels.

The extrapolation process can effectively correct for this problem by producing a “pseudo” set of data having the highest expected signal amplitude. Essentially, unlike averaging or interpolation, extrapolation creates fictitious data points at high power levels to help in generating a more stable solution. This forces the DDPD solution to behave well for signals with large peaks.

In the situation where extrapolation is applicable, covariance matrices A and B have been computed only up to a certain power level. Power level ranges or “bins” near these power levels can be established, and after a certain number of data captures have occurred in the highest bin, the previously computed A and B matrices within that bin can be averaged to obtain A_(ave) and B_(ave). Extrapolated A and B matrices, A_(extrap) and B_(extrap), can then be computed by multiplying A_(ave) and B_(ave) by particular scaling factors, which may be the same or different. A_(extrap) and B_(extrap) provide a fictitious data point at a high power level for use in computing covariance matrices A and B at lower power levels, which may optionally utilize the averaging and weighting methodologies discussed above. For example, (0.001)A_(extrap) and (0.001)B_(extrap) may be added into the averaged and/or interpolated A and B matrices as described above. Optionally, because factoring in an extrapolated high power data point will make lower power solutions less accurate, if no high power levels have been received for some predetermined period of time, A_(extrap) and B_(extrap) can be scaled back even further (by reducing the scaling factors) so that extrapolation will have less of a negative effect at lower power levels.

In addition to computing A_(extrap) and B_(extrap), the phase can be rotated, which can help approximate a more accurate solution, because the PA itself performs some amount of phase rotation at high power levels.

An alternative extrapolation methodology from the one described above extrapolates the captured transmit and feedback signals before computing the corresponding A and B matrices. In this process, let y_(H)[n] and z_(H)[n] be the captured DDPD signal and feedback signal, respectively, when the signal is high. The extrapolation process then amplifies y_(H)[n] by a factor of g₁, and amplifies and rotates z_(H)[n] by a factor of g₂e^(−jθ).

FIGS. 19 a and 19 b show the transmit and feedback signal amplitudes of the data before (see reference character 948) and after applying the scaling (see reference character 952). This essentially extends the AA (amplitude to amplitude) curve for larger inputs and even possibly into the saturation zone. The AA curve is normalized to unity gain.

FIG. 20 a shows the theoretical extension of the AA curve. The extension (see reference character 938) is created simply by extending a line to the end of the curve that has the same slope at the maximum power as the data. However, for an easy practical implementation it is far easier to apply a gain to both data sets related to their slope at the peak of the data capture.

FIG. 20 b shows this approach. The extrapolated AA curve using this method (see reference character 942) now is less accurate for small data sets. The AP curve plots the input amplitude versus output phase minus input phase.

FIGS. 21 a and 21 b show example AP curves. Note that the angle decreases as the input amplitude increases. For the best extrapolation, this curve would be extended as shown in FIG. 21 a (see reference character 944). However, for an easy practical implementation, a constant phase rotation of −θ is performed on the data. FIG. 21 b shows the resulting curve that ends at the desired location (see reference character 946), but for smaller input powers does not track the optimum extrapolation AP curve very well.

The amplified and rotated signals are written as,

y _(E) [n]=g ₁ y _(H) [n]  (9.58)

z _(E) [n]=g ₂ e ^(−jθ) z _(H) [n]  (9.59)

Based on the amplified signals, y_(E)[n] and z_(E)[n], the corresponding covariance matrices, A_(E) and B_(E), can be obtained.

FIG. 22 illustrates a block diagram of the averaging, interpolation and extrapolation that is used to stabilize the DDPD coefficient solution. First, in block 936 it must be determined whether a processor or other configuration logic has set up an extrapolation mode. If so, the extrapolated matrices A_(E) and B_(E) are updated instead. The compute block 956 performs the phase rotation of B only, while the multiplier blocks 958 apply a programmable scaling factor. If no extrapolation mode has been set up, averaging and interpolation may be performed in block 936 to generate averaged and interpolated A_(I) and B_(I) values. From FIG. 22, we see that A_(E) and B_(E) are added to the averaged and interpolated A_(I) and B_(I) (see reference character 954).

We can now write the new “normal” equation using the extrapolation matrices as:

[A _(A,I)+ρ_(E) A _(E) ]W _(A,I,E) =[B _(A,I)+ρ_(E) B _(E)]  (9.60)

where ρ_(E) is some programmable parameter, A_(A,I) and B_(A,I) are the averaged, interpolated A and B matrices, and W_(A,I,E) are the resulting weights. Since neither the practical AA nor AP curves track the theoretical curves very well at lower power levels, we add these new A_(E) and B_(E) matrices at a lower level than the original ones. Thus ρ_(E) is usually much smaller than ρ_(L) or ρ_(H). The new DDPD solution is expressed as:

A _(A,I,E) W _(A,I,E) =B _(A,I,E)  (9.61)

And the new combined interpolation/extrapolation of A and B now becomes

A _(A,I,E) =A _(A,I)+ρ_(E) A _(E)  (9.62)

B _(A,I,E) =B _(A,I)+ρ_(E) B _(E)  (9.63)

From Equations (9.15) and (9.16), to get the A_(E) and B_(E) matrices,

A _(E) =K _(E) ^(H) K _(E)  (9.64)

B _(E) =K _(E) ^(H) Y _(E)  (9.65)

When z is rotated by an angle, z_(E)[n]=g₂e^(−jθ)z_(H)[n], there is no change to A_(E) from this angle change. This can be seen by looking at Equation (9.13). Each element of K_(E) will have an e^(−jθ) factor. Each element of K_(E) ^(H) will have an e^(jθ) factor. The rotations cancel. In a similar manner, since each element of K_(E) ^(H) has a factor of e^(jθ), it can be factored out of the B_(E) computation and applied after the multiplication, saving a lot of multiplies. If, for z_(E)[n]=g₂z_(H)[n] (no phase rotation), B_(E,no rotate)=K_(E, no rotate) ^(H)Y_(E), then,

B _(E) =K _(E) ^(H) Y _(E) =e ^(jθ) K _(E,no rotate) ^(H) Y _(E) =e ^(jθ) B _(E,no rotate)  (9.66)
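The claim that the rotation leaves A_(E) unchanged and can be applied to B_(E) after the multiplication can be checked numerically. The sketch below is an illustration under stated assumptions: `build_K` uses a hypothetical two-column basis (a linear term and a third-order term z|z|²) rather than the full K of equation (9.13), and the sample values, g₁, g₂ and θ are made up.

```python
import cmath


def build_K(z):
    # Hypothetical two-column basis: linear term and third-order term z*|z|^2.
    return [[s, s * abs(s) ** 2] for s in z]


def normal_matrices(K, Y):
    """Return A = K^H K and B = K^H Y."""
    M = len(K[0])
    A = [[sum(r[i].conjugate() * r[j] for r in K) for j in range(M)]
         for i in range(M)]
    B = [sum(r[i].conjugate() * y for r, y in zip(K, Y)) for i in range(M)]
    return A, B


theta = cmath.pi / 36                  # 5 degrees, expressed in radians
g1, g2 = 1.2, 1.1                      # made-up gains
y_H = [0.8 + 0.1j, -0.3 + 0.7j, 0.5 - 0.6j]
z_H = [0.7 + 0.2j, -0.2 + 0.6j, 0.45 - 0.5j]

y_E = [g1 * y for y in y_H]                               # equation (9.58)
z_rot = [g2 * cmath.exp(-1j * theta) * z for z in z_H]    # equation (9.59)
z_norot = [g2 * z for z in z_H]                           # same gain, no rotation

A_rot, B_rot = normal_matrices(build_K(z_rot), y_E)
A_nr, B_nr = normal_matrices(build_K(z_norot), y_E)
B_shortcut = [cmath.exp(1j * theta) * b for b in B_nr]    # rotate after multiplying
```

With this basis, `A_rot` matches `A_nr` and `B_rot` matches `B_shortcut` to rounding error, so only M complex multiplies are needed for the rotation instead of one per captured sample.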

The block diagram of the combined interpolation/extrapolation of A and B is shown in FIG. 22. This shows just one possible implementation.

In order to find the slope to apply to the input data, the following procedure is followed. First, read 1000 (or some number) of samples from the time-aligned data (remove and data shift), z_(H) and y_(H). If √(y_(H)*[i]y_(H)[i])>T2, then:

ddpd_sum_T2=ddpd_sum_T2+√(y_(H)*[i]y_(H)[i]),

adc_sum_T2=adc_sum_T2+√(z_(H)*[i]z_(H)[i]), and

T2count=T2count+1.

Else if √(y_(H)*[i]y_(H)[i])>T1, then:

ddpd_sum_T1=ddpd_sum_T1+√(y_(H)*[i]y_(H)[i]),

adc_sum_T1=adc_sum_T1+√(z_(H)*[i]z_(H)[i]), and

T1count=T1count+1.

Next, get the average coordinate in each section:

adc_ave_T2=adc_sum_T2/T2count,

ddpd_ave_T2=ddpd_sum_T2/T2count,

adc_ave_T1=adc_sum_T1/T1count, and

ddpd_ave_T1=ddpd_sum_T1/T1count.

Finally, find the slope:

Slope=(adc_ave_T2−adc_ave_T1)/(ddpd_ave_T2−ddpd_ave_T1)

Typically g1 is fixed to approximately a 0.5 to 2.5 dB increase, depending on how much more peaking than the captured data is expected, and g2=g1*Slope.
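The slope procedure and the relation g2=g1*Slope can be sketched as follows. The function name `extrapolation_gains`, the threshold values, and the sample data are assumptions for illustration. The sketch also assumes that the dB figure for g1 is an amplitude gain, i.e. g1=10^(dB/20), which the text does not state explicitly.

```python
def extrapolation_gains(y_H, z_H, T1, T2, g1_db=1.0):
    """Return (slope, g1, g2) from time-aligned captures, with T1 < T2."""
    ddpd_sum = {"T1": 0.0, "T2": 0.0}
    adc_sum = {"T1": 0.0, "T2": 0.0}
    count = {"T1": 0, "T2": 0}
    for y, z in zip(y_H, z_H):
        mag = abs(y)                     # equals sqrt(y* y)
        if mag > T2:
            section = "T2"
        elif mag > T1:
            section = "T1"
        else:
            continue                     # below both thresholds: not used
        ddpd_sum[section] += mag
        adc_sum[section] += abs(z)
        count[section] += 1
    if count["T1"] == 0 or count["T2"] == 0:
        raise ValueError("each section needs at least one sample")
    ddpd_ave = {s: ddpd_sum[s] / count[s] for s in count}
    adc_ave = {s: adc_sum[s] / count[s] for s in count}
    slope = (adc_ave["T2"] - adc_ave["T1"]) / (ddpd_ave["T2"] - ddpd_ave["T1"])
    g1 = 10 ** (g1_db / 20)              # assumption: dB figure is an amplitude gain
    return slope, g1, g1 * slope


# Made-up, time-aligned samples; the first one falls below T1 and is skipped.
y_H = [0.1, 0.3 + 0.4j, 0.6j, 0.9, -1.0]
z_H = [0.1, 0.5, 0.6, 0.8j, 0.9]
slope, g1, g2 = extrapolation_gains(y_H, z_H, T1=0.4, T2=0.8)
```

For this toy data the feedback amplitude grows more slowly than the DDPD amplitude near the peak, so the slope is below one and g2 comes out smaller than g1.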

In order to find the θ to apply to the B_(E) matrix, it is easiest to simply apply different θ's and, based on performance, choose the one which yields the best performance. A range of 0-10 degrees should be tested, with five degrees the expected value.

When doing a data capture for producing the extrapolation matrices, A_(E) and B_(E), the data capture should contain a sufficient number of points but can be a smaller portion of the total capture length. A length of 1000 samples at 184.32 MHz is a typical length. By using fewer samples, the solution will work harder to fit the peak since its contribution to the error will be a larger percentage of the error.

Finally, setting the ρ_(E) parameter can be done by testing the level which works best. The range of ρ_(E) is 2⁻⁵ to 2⁻¹⁶ depending on the signal strength and the number of samples processed to obtain the extrapolation matrix. A smaller data capture will require a larger ρ_(E). A typical ρ_(E) for a data capture length of 1000 is 1/128. As the signal level changes, it may be optimum to vary ρ_(E) so that the extrapolation matrix does not dominate the solution. Generally, ρ_(E) should vary one-for-one with the input signal power.

Linear Extrapolation Processing.

The solution of the normal equation provides good DDPD linearization for input signal levels that have the same or less signal power than that used in the coefficient computation. For the purposes of this discussion, let's assume that there is no interpolation or extrapolation being done. If the input signal amplitude exceeds the maximum amplitude of the samples that were captured for the current DDPD coefficients, this coefficient solution can be ill-defined. The higher order coefficients of the lower power solution (5th or 7th or higher) will typically have a larger magnitude than if a larger input was used in the solution. These higher order coefficients can corrupt the linearization of signal samples corresponding to a higher power input.

FIG. 23 shows the possible performance degradation if the current solution is based on data with peaks limited to 2000 in amplitude. With an input of amplitude 2300, the output can become unstable and performance would degrade. For example, suppose the current DDPD coefficients are obtained based on a weak transmitted signal having a digital peak amplitude less than or equal to 2000 as shown in FIG. 23. In this case, the polynomial solution W should linearize correctly for signals with maximum amplitudes of 2000 or less. But if the input signal has a peak amplitude of 2300, then the DDPD coefficients may produce a spurious DDPD engine output for this higher peak. Looking at FIG. 23, the solution 962 deviates greatly from the actual solution 964 for signals peaking above 2000. In this case, the output of the PA can have a very high spectral amplitude that violates the SEM requirements. The objective is to maintain the higher order coefficients (5th or 7th or higher) to avoid spurious outputs.

FIG. 24 illustrates the benefits of Linear Extrapolation. This figure shows a Linear Extrapolation characteristic 966 as a straight line approximating a linear gain PA. By adding in this characteristic at a very low level to the actual data, the solution is prevented from producing large, incorrect outputs. The resultant DDPD solution when Linear Extrapolation is applied is shown at 968. There is still error from the actual solution, but we avoid very large outputs which can saturate or damage the PA. FIG. 24 shows how Linear Extrapolation can produce a more stable solution for points beyond the captured data.

FIG. 25 illustrates another benefit of Linear Extrapolation for low power inputs. Solving the normal equation in this case produces large, unstable high order term coefficients. For example, suppose the current DDPD coefficients are obtained based on a weak transmitted signal such as in FIG. 25 curve (c). In this case, the polynomial solution W should linearize correctly for signals with a similarly low power level. But if the input signal suddenly increases, then the DDPD solution may produce an incorrect predistortion signal for this higher power case. Curve (c) in FIG. 25 deviates greatly from the actual solution for high power signals, curve (b). In this case of suddenly increased input power, the output of the PA can have a very high spectral amplitude that violates the SEM requirements. When linear extrapolation is done, the solution does not shoot up, but instead follows the thick gray trace labeled as (e) continuing to the dot, which is typically set to the maximum input expected. For unexpected inputs larger than the input level of the dot, the output is still ill-defined. This is shown by the curvy line 970 past the dot up to the arrowhead at point (d).

A different method to keep the solution stable is to force the high-order terms to zero. This requires some logic and may have undesirable effects if the signal level changes rapidly back and forth and the order of the solution is also changing constantly. This also requires the detection of signal amplitude and threshold comparisons.

The desired approach for solving both the high input problem and the low input signal problem is the use of Linear Extrapolation (LE), which keeps the polynomial's high order terms sufficiently accurate while not causing instability or degrading performance when the input signal is low. Using Linear Extrapolation can supplant zeroing off high order terms and improves the DDPD performance stability. This allows us to always solve for all order terms (i.e., there is no need for different polynomial orders for different signal levels). This linear extrapolation essentially forces a fixed slope of the AA curve and zero slope of the AP curve, preventing ill behavior of the polynomial solution when signals exceed the captured level.

In Linear Extrapolation, instead of scaling A and B by different amounts in an attempt to account for the nonlinear nature of the curve at high power levels, A and B are scaled so that the curve remains linear at high power levels. The processing steps of Linear Extrapolation are as follows. First, linearize the system. During a usual data capture, continue all the usual processing but increase y[n] by 0-3 dB and then set z[n]=y[n] (i.e. set the input equal to the output). Go through the Sample Shift and compute the Linear Extrapolation normal matrices (i.e., covariance matrices) A_(LE) and B_(LE). Second, multiply the matrix B_(LE) by exp(jθ), where θ is the phase of B_(H)(1), where B_(H) is the high covariance matrix. This forces the Linear Extrapolation to be phase aligned to the input signal phase. An additional phase shift can be applied which could help performance. Third, scale A_(LE) and B_(LE) by the same amount, such as a factor of ρ_(LE) which is very small, e.g., ρ_(LE)=2⁻¹⁵ to 2⁻²⁰. This factor can vary as signal power varies. This produces extrapolated A and B matrices at a high power level and a curve that follows a unity curve (is on a straight line). Such an A matrix has very small off-diagonal values and very large diagonal values. As the off-diagonal values reduce to zero, it becomes a diagonal noise matrix. These A and B matrices can then be phase-rotated. Finally, add the matrices A_(LE) and B_(LE) to the extrapolation matrices to obtain the final matrices, A_(F) and B_(F), that are ready for Gaussian Elimination Processing as detailed next.
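The first three Linear Extrapolation steps can be sketched as below. This is a hedged illustration, not the disclosed implementation: the function name `linear_extrapolation_matrices`, the two-term basis used to build K, the 1.5 dB boost (treated as an amplitude gain), and the sample values are all assumptions, and the Sample Shift step is omitted.

```python
import cmath


def linear_extrapolation_matrices(y, B_H, gain_db=1.5, rho_LE=2.0 ** -15):
    """Return (rho_LE * A_LE, rho_LE * exp(j*theta) * B_LE) for a capture y."""
    g = 10 ** (gain_db / 20)                   # 0 to 3 dB boost, amplitude gain assumed
    y_le = [g * s for s in y]
    z_le = list(y_le)                          # step 1: set z[n] = y[n]
    K = [[s, s * abs(s) ** 2] for s in z_le]   # hypothetical two-term basis
    M = 2
    A = [[sum(r[i].conjugate() * r[j] for r in K) for j in range(M)]
         for i in range(M)]
    B = [sum(r[i].conjugate() * s for r, s in zip(K, y_le)) for i in range(M)]
    rot = cmath.exp(1j * cmath.phase(B_H[0]))  # step 2: theta = phase of B_H(1)
    A_scaled = [[rho_LE * a for a in row] for row in A]   # step 3: same scale for
    B_scaled = [rho_LE * rot * b for b in B]              # both A_LE and B_LE
    return A_scaled, B_scaled


y = [0.8 + 0.1j, -0.3 + 0.7j, 0.5 - 0.6j]      # made-up capture
B_H = [2j, 1 + 0j]                             # made-up high covariance vector
A_LE_scaled, B_LE_scaled = linear_extrapolation_matrices(y, B_H)
```

Because z equals y here, the first element of the unrotated B_(LE) is a sum of squared magnitudes and therefore real, so after the rotation its phase equals the phase of B_(H)(1).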

FIG. 26 illustrates the implementation block diagram of the normal matrices combining. In Normal Matrices Combining, the normal matrices A_(F) and B_(F) are obtained by the weighted combination of the Linear Extrapolation, Extrapolation and Interpolation/Averaging. This combination is expressed as follows:

A _(F)=ρ_(LE) A _(LE)+ρ_(E) A _(E) +A _(A,I)  (9.69)

B _(F)=ρ_(LE) B _(LE)+ρ_(E) B _(E) +B _(A,I)  (9.70)

A _(F) W _(F) =B _(F)

where A_(F) and B_(F) are the normal matrices of all of the processing to this point, A_(LE) and B_(LE) are the normal matrices for linear extrapolation (these are pre-computed and fixed) (see reference character 972), A_(E) and B_(E) are the normal matrices for extrapolation of high amplitude data capture, A_(A,I) and B_(A,I) are the normal matrices for the averaged and interpolated data, ρ_(LE) is the weight for A_(LE) and B_(LE), and ρ_(E) is the weight for A_(E) and B_(E).
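Equations (9.69) and (9.70) amount to an element-wise weighted sum, which might be sketched as follows. The function name `combine_normal_matrices` and the toy values are assumptions for illustration; ρ_(E)=2⁻⁷ and ρ_(LE)=2⁻¹⁵ are taken from the ranges quoted in the text.

```python
def combine_normal_matrices(A_AI, B_AI, A_E, B_E, A_LE, B_LE, rho_E, rho_LE):
    """Return A_F and B_F as the weighted combination of the three contributions."""
    A_F = [[rho_LE * le + rho_E * e + ai for le, e, ai in zip(r_le, r_e, r_ai)]
           for r_le, r_e, r_ai in zip(A_LE, A_E, A_AI)]
    B_F = [rho_LE * le + rho_E * e + ai for le, e, ai in zip(B_LE, B_E, B_AI)]
    return A_F, B_F


# Made-up 2 x 2 matrices chosen so that each weighted term contributes 1.0.
A_F, B_F = combine_normal_matrices(
    A_AI=[[1.0, 0.0], [0.0, 1.0]], B_AI=[1.0, 2.0],
    A_E=[[128.0, 0.0], [0.0, 128.0]], B_E=[128.0, 0.0],
    A_LE=[[32768.0, 0.0], [0.0, 32768.0]], B_LE=[0.0, 32768.0],
    rho_E=2.0 ** -7, rho_LE=2.0 ** -15)
```

Since both weights are powers of two, a fixed-point implementation could plausibly realize them as bit shifts, although the text does not say how the multiplier blocks are built.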

Zero Off High Order Terms.

In the Zero Off High Order Terms block 940, when the signal level is low, there is no significant distortion energy in most of the spectrum. The high order weights tend to become unstable in this case. One method to correct this problem is to zero the high order polynomial terms as the signal drops. Predetermined signal level thresholds can be experimentally determined, and depending on the actual signal level, all or some of the high order terms can be set to zero, leaving only a linear solution or at least a solution with fewer non-linear terms.

Gaussian Elimination Processing.

The DDPD normal equation is expressed as:

AW=B  (9.72)

where the A and B of this equation are A_(G) and B_(G). To simplify the notation we will use A and B instead. This equation is composed of M linear equations which need to be solved to obtain W. The most practical method for solving for W is complex Gaussian Elimination, which is a method for solving multiple equations (e.g. 15 equations) having multiple unknowns (e.g. 15 unknowns). In embodiments of the present invention, Gaussian Elimination is performed in the Gaussian Elimination Processing block 950.
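Complex Gaussian Elimination for AW=B can be sketched in floating point as follows. The function name `gaussian_eliminate` and the 3×3 test system are assumptions for illustration. Partial pivoting is added here as a common numerical safeguard; the text does not state whether the hardware block pivots, and a fixed-point implementation would differ in detail.

```python
def gaussian_eliminate(A, B):
    """Solve A W = B for complex A (n x n) and B (length n)."""
    n = len(A)
    # Augmented matrix [A | B], copied so the inputs are not modified.
    aug = [list(map(complex, row)) + [complex(b)] for row, b in zip(A, B)]
    for col in range(n):
        # Partial pivoting (an assumption): pick the largest-magnitude pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) == 0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Forward elimination of the entries below the pivot.
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    # Back substitution.
    W = [0j] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * W[c] for c in range(r + 1, n))
        W[r] = (aug[r][n] - s) / aug[r][r]
    return W


# Made-up Hermitian, diagonally dominant system with a known solution.
A = [[4, 1 - 1j, 0], [1 + 1j, 3, 1j], [0, -1j, 2]]
W_true = [1, -1j, 0.5 + 0.5j]
B = [sum(a * w for a, w in zip(row, W_true)) for row in A]
W = gaussian_eliminate(A, B)
```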

Other Solution Techniques.

Gaussian Elimination is shown as the method to derive the coefficients. Other methods including Gauss-Jordan elimination and the iterative Gauss-Seidel method may also be employed.

DDPD Coefficient Validator.

The computed coefficients (weights) may not necessarily be correct coefficients for the DDPD engine. This may be due to the fixed-bit arithmetic precision in the computation process, or the data may be saturated or corrupted in some way. The objective of the DDPD Coefficient Validator block 960 is to check the coefficients before they are applied to the DDPD engine. This prevents the application of coefficients that are inaccurate because of saturation or truncation effects during the computation process. To ensure the computed coefficient set, W, is good, any coefficient verification or validation method known to those skilled in the art can be used.

Gaussian Elimination Validator.

Alternatively, a method referred to herein as the Gaussian Elimination Validator may be utilized according to embodiments of the invention. The Gaussian Elimination Validator methodology checks the Gaussian Elimination process by substitution of W. Since Gaussian Elimination solves the normal equation, AW=B, the result of B−AW should be small relative to the size of the elements of B. The process to verify whether Gaussian Elimination produced an accurate solution is as follows. First, compute ΔB=B−AW, and compute the normalized error

$E = \frac{\left\| {\Delta B} \right\|}{\left\| B \right\|}$

Next, reject W if E>ζ, where ζ is an error threshold. If W is rejected, the current coefficient set will continue to be applied. If W passes the test, the new weights are loaded into the DDPD engine and the process repeats for the next coefficients.
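The substitution check can be sketched as below. The function name `validate_weights`, the default threshold, and the toy system are assumptions for illustration; the Euclidean norm is also an assumption, since the text only requires that B−AW be small relative to B.

```python
def validate_weights(A, B, W, zeta=1e-3):
    """Return (accepted, E) where E = ||B - A W|| / ||B||, Euclidean norm assumed."""
    delta_B = [b - sum(a * w for a, w in zip(row, W)) for row, b in zip(A, B)]

    def norm(v):
        return sum(abs(x) ** 2 for x in v) ** 0.5

    if norm(B) == 0:
        raise ValueError("B is all zeros; normalized error is undefined")
    E = norm(delta_B) / norm(B)
    return E <= zeta, E


# Made-up 2 x 2 system: the exact solution is W = [1, 2].
A = [[2, 0], [0, 2]]
B = [2, 4]
ok_good, E_good = validate_weights(A, B, [1, 2])      # exact weights
ok_bad, E_bad = validate_weights(A, B, [1, 1.5])      # corrupted second weight
```

In this toy case the exact weights give E equal to zero and are accepted, while the corrupted set gives E of about 0.22 and is rejected, so the previous coefficient set would stay in use.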

Although embodiments of this invention have been fully described with reference to the accompanying drawings, it is to be noted that various changes and modifications will become apparent to those skilled in the art. Such changes and modifications are to be understood as being included within the scope of embodiments of this invention as defined by the appended claims.

What is claimed is:
 1. A method for an apparatus for delaying a pre-distorted transmit signal of data samples at a given data rate prior to its transmission by a distorting element or distorting system, comprising: routing successive data samples of the transmit signal to N parallel data paths in a round-robin fashion, each of the N parallel data paths being independently configurable to produce a delay of zero or one delay unit, wherein one delay unit is equal to a period of the given data rate multiplied by N; connecting each of the N parallel data paths to N parallel data inputs of an N×N crossbar, the crossbar being configurable to route any of the N parallel data inputs to any of N parallel data outputs of the crossbar; re-combining successive data samples from the N parallel data paths into a single data path to reconstitute the transmit signal; and configuring the N parallel data paths and the crossbar to produce an effective delay at the reconstituted transmit signal of zero to N−1 periods of the given data rate.
 2. An apparatus for delaying a pre-distorted transmit signal of data samples at a given data rate prior to its transmission by a distorting element or distorting system, comprising: N parallel data paths being independently configurable to produce a delay of zero or one delay unit, wherein one delay unit is equal to a period of the given data rate multiplied by N; a first commutator coupled to the N parallel data paths and configured for routing successive data samples of the transmit signal to the N parallel data paths in a round-robin fashion, each of the N parallel data paths being independently configurable to produce a delay of zero or one delay unit, wherein one delay unit is equal to a period of the given data rate multiplied by N; an N×N crossbar coupled to the N parallel data paths and configured for connecting each of the N parallel data paths to N parallel data inputs of an N×N crossbar, the crossbar being configurable to route any of the N parallel data paths to any of N parallel data outputs of the crossbar; and a second commutator coupled to the N×N crossbar and configured for re-combining successive data samples from the N parallel data paths into a single data path to reconstitute the transmit signal; wherein the N parallel data paths and the crossbar are configurable to produce an effective delay at the reconstituted transmit signal of zero to N−1 periods of the given data rate.
 3. A computer program product for delaying a pre-distorted transmit signal of data samples at a given data rate prior to its transmission by a distorting element or distorting system, and including one or more computer readable instructions embedded on a non-transitory computer readable storage medium and configured to cause one or more computer processors to perform the steps of: routing successive data samples of the transmit signal to N parallel data paths in a round-robin fashion, each of the N parallel data paths being independently configurable to produce a delay of zero or one delay unit, wherein one delay unit is equal to a period of the given data rate multiplied by N; connecting each of the N parallel data paths to N parallel data inputs of an N×N crossbar, the crossbar being configurable to route any of the N parallel data inputs to any of N parallel data outputs of the crossbar; re-combining successive data samples from the N parallel data paths into a single data path to reconstitute the transmit signal; and configuring the N parallel data paths and the crossbar to produce an effective delay at the reconstituted transmit signal of zero to N−1 periods of the given data rate.